Data Collection

Crawler Module

Extract the visible content of any public webpage and pass the text downstream for analysis, summarization, or AI workflows.

Provider
Netflow

Overview

The Crawler Module allows you to input a URL and automatically crawl through multiple linked pages on the same domain. This makes it ideal for gathering larger datasets such as product catalogs, blog archives, or paginated listings.
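The module's internals are not documented here, but the same-domain, multi-page behavior described above can be sketched as a breadth-first crawl with a page limit. This is a minimal illustration, not the module's actual implementation; the `crawl` and `fetch` names are hypothetical, and the fetch step is passed in as a callable so the network layer stays out of the sketch.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collects href targets from anchor tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl that stays on the start URL's domain.

    `fetch` is any callable that returns the HTML for a URL
    (or None on failure), so it can be mocked or rate-limited.
    """
    domain = urlparse(start_url).netloc
    seen, queue, pages = {start_url}, deque([start_url]), {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            # Follow only links on the same domain as the start URL
            if urlparse(absolute).netloc == domain and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

For example, starting from `https://example.com/products/`, the crawl would follow `/products/p1` (same domain) but skip a link to `https://other.com/x`.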

Key Features

Multi-Page Crawling
Quick & Easy Setup
Reliable Extraction

Use Cases

Content Summarization
Data Collection
Knowledge Base Building
Trend Monitoring

User Manual

1. Drag & Drop

Add the Crawler module to your workspace from the sidebar.

2. Enter URL

Type the full URL of the site or directory you want to crawl. Example: https://example.com/products/

3. Run the Module

Click Run to start crawling.

4. View the Result

Find the output in the Crawler Output section.
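The exact output format is not specified in this page; assuming the Crawler Output holds raw HTML per page, a downstream step for summarization or analysis might first reduce each page to its visible text. This is a hedged sketch (the `visible_text` helper is illustrative, not part of the module):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Accumulates visible text, skipping script and style contents."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text outside script/style blocks
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def visible_text(html):
    """Return the visible text of an HTML fragment as one string."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```

The cleaned text can then be passed to whatever summarization or AI workflow consumes the module's output.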

Quick Info

Category
Data Collection
Provider
Netflow
Module Type

Tags

Web Scraping, Automation, Data Extraction