A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
☆381Dec 30, 2022Updated 3 years ago
Alternatives and similar repositories for supercrawler
Users that are interested in supercrawler are comparing it to the libraries listed below.
- Web crawler for Node.JS☆257Apr 15, 2026Updated 2 months ago
- Flexible event driven crawler for node.☆2,134Mar 7, 2021Updated 5 years ago
- Web Crawler/Spider for NodeJS + server-side jQuery ;-)☆6,794Jun 18, 2026Updated last week
- CoCrawler is a versatile web crawler built using modern tools and concurrency.☆194Apr 29, 2022Updated 4 years ago
- Distributed crawler powered by Headless Chrome☆5,643Apr 29, 2023Updated 3 years ago
- A collection of awesome web crawlers and spiders in different languages☆7,239Jun 16, 2024Updated 2 years ago
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON.☆70Jun 8, 2021Updated 5 years ago
- A scalable frontier for web crawlers☆1,330Jun 6, 2025Updated last year
- Click heatmaps with Google Analytics☆36Aug 21, 2013Updated 12 years ago
- Visually diff websites☆20Jan 22, 2018Updated 8 years ago
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues.☆437Dec 30, 2022Updated 3 years ago
- The next web scraper. See through the <html> noise.☆5,904May 6, 2026Updated last month
- 🔮 A Node.js scraper for humans.☆4,073Oct 13, 2025Updated 8 months ago
- ☆10Dec 23, 2019Updated 6 years ago
- Generate an object for testing if a request is sent, request is Mikeal's request.☆43Oct 15, 2020Updated 5 years ago
- Automatically extracts structured information from webpages☆111Jun 23, 2022Updated 4 years ago
- Chromium / Puppeteer site crawler☆48Mar 30, 2020Updated 6 years ago
- Basic integration between GraphQL-RxJs & GraphQL-Transport-WS☆12Nov 3, 2017Updated 8 years ago
- Puppeteer Pool, run a cluster of instances in parallel☆3,514Mar 1, 2026Updated 4 months ago
- Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head☆176May 19, 2020Updated 6 years ago
- ExtractContent for node.js☆15Apr 14, 2026Updated 2 months ago
- A simple collaborative textfield for nodejs☆18Apr 26, 2017Updated 9 years ago
- Javascript scraping module based on puppeteer for many different search engines...☆568Dec 30, 2022Updated 3 years ago
- 🔎 📖 ✨ Custom, private search engine for text documents built with NextJS/React/ES6/ES7☆32Mar 20, 2025Updated last year
- Lightweight API for YouTube (Google API v3)☆16Dec 6, 2025Updated 6 months ago
- Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data …☆24,227Updated this week
- Broad crawler for domain discovery☆20Apr 8, 2026Updated 2 months ago
- Schema.io + Node API starter kit☆15Nov 26, 2017Updated 8 years ago
- The simple, easy to use command line web crawler.☆354Aug 8, 2024Updated last year
- 📊 Repository for the study on 11.8 Million Google Search Results☆28Mar 11, 2020Updated 6 years ago
- Throw JavaScript objects at the index and they will become retrievable by their properties using promises and map-reduce☆20Aug 8, 2025Updated 10 months ago
- IRL version of Chrome Offline T-Rex game☆12Apr 16, 2025Updated last year
- Scrapoxy has been discontinued.☆2,414Feb 7, 2026Updated 4 months ago
- ☆16Jun 6, 2025Updated last year
- A Node.js module to search and scrape Google.☆456Oct 4, 2018Updated 7 years ago
- NER toolkit for HTML data☆259May 3, 2024Updated 2 years ago
- 📦 A set of small and performant JS and Twig components☆12Jun 11, 2026Updated 2 weeks ago
- CQRS example with Go, MySQL, NATS, ElasticSearch☆11Jun 1, 2018Updated 8 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate…☆53Jun 12, 2020Updated 6 years ago