brendonboshell / supercrawler
A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
☆379Updated 2 years ago
Alternatives and similar repositories for supercrawler
Users who are interested in supercrawler are comparing it to the libraries listed below.
- Javascript scraping module based on puppeteer for many different search engines...☆561Updated 2 years ago
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues.☆430Updated 2 years ago
- Web crawler for Node.JS☆253Updated 7 years ago
- Scrape or crawl articles from any site automatically. Make any web page readable, whether it is in Chinese or English.☆344Updated 6 years ago
- Flexible event driven crawler for node.☆2,142Updated 4 years ago
- Simple, lightweight and expressive web scraping with Node.js☆154Updated 3 years ago
- Node.js module and CLI tool to get proxies from publicly available proxy lists.☆625Updated 3 years ago
- Advanced Node proxy checker (node proxy verifier, node proxy tester) with socks and https support☆109Updated 2 years ago
- Declarative DOM extraction expression evaluator. 👨‍⚕️☆694Updated 5 years ago
- Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)☆500Updated 5 years ago
- simple multi-level scraper json input/output for Cheerio☆199Updated 2 years ago
- Chromium / Puppeteer site crawler☆49Updated 5 years ago
- Puppeteer (Headless Chrome Node API)-based rendering solution.☆538Updated 3 years ago
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON.☆70Updated 4 years ago
- Verify email address checking MX records, and SMTP connection.☆125Updated 4 years ago
- Google Search SERP Scraper☆113Updated 2 years ago
- Node module that summarizes text using a naive summarization algorithm☆769Updated 8 months ago
- Cloud crawler functions for scrapeulous☆45Updated 4 years ago
- Proxies Puppeteer Page requests.☆208Updated 9 months ago
- House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.☆121Updated 2 years ago
- High-performance FlexSearch Server for Node.js (Cluster)☆189Updated 6 years ago
- Email automation driven by headless chrome.☆167Updated 4 years ago
- Automatically extract body content (and other cool stuff) from an html document☆2,158Updated 2 years ago
- A NodeJS implementation of the Rapid Automatic Keyword Extraction algorithm.☆103Updated last year
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee…☆95Updated 2 years ago
- Unsupervised learning approach to building an article spinner to automatically generate content☆75Updated 7 years ago
- A Node.js module to search and scrape Google.☆454Updated 6 years ago
- Run Puppeteer code in the cloud☆737Updated last year
- Node.js email SMTP verification, powered by EmailChecker.com API☆289Updated 2 years ago
- Get a random user agent (with an optional filter to select from a specific set of user agents).☆257Updated 2 years ago