A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
☆382 · Dec 30, 2022 · Updated 3 years ago
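The idea of defining custom handlers to parse content can be pictured with a small sketch. This is not Supercrawler's actual API: `HandlerRegistry`, `addHandler`, `dispatch`, and `extractLinks` are hypothetical names chosen for illustration, and the regex-based link extraction stands in for a real HTML parser.

```javascript
// Hypothetical sketch, not Supercrawler's real API.
// A registry that maps content types to parsing handlers.
class HandlerRegistry {
  constructor() {
    this.handlers = [];
  }

  // Register a handler for one content type; "*" matches every type.
  addHandler(contentType, fn) {
    this.handlers.push({ contentType, fn });
  }

  // Run every matching handler and collect the values they return.
  dispatch(response) {
    // Drop parameters such as "; charset=utf-8" before matching.
    const type = response.contentType.split(";")[0].trim().toLowerCase();
    return this.handlers
      .filter((h) => h.contentType === "*" || h.contentType === type)
      .map((h) => h.fn(response));
  }
}

// Example handler: collect href values from anchor tags with a simple regex.
// A production crawler would use a proper HTML parser instead.
function extractLinks(response) {
  const links = [];
  const pattern = /<a\s[^>]*href="([^"]*)"/gi;
  let match;
  while ((match = pattern.exec(response.body)) !== null) {
    links.push(match[1]);
  }
  return links;
}

const registry = new HandlerRegistry();
registry.addHandler("text/html", extractLinks);
```

In this sketch a crawler would call `registry.dispatch(...)` once per fetched page and queue any links the handlers return. Responses with an unregistered content type simply yield an empty array.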
Alternatives and similar repositories for supercrawler
Users who are interested in supercrawler are comparing it to the libraries listed below.
- Flexible event driven crawler for node. ☆2,133 · Mar 7, 2021 · Updated 5 years ago
- Web Crawler/Spider for NodeJS + server-side jQuery ;-) ☆6,790 · May 28, 2025 · Updated 11 months ago
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆193 · Apr 29, 2022 · Updated 4 years ago
- Distributed crawler powered by Headless Chrome ☆5,653 · Apr 29, 2023 · Updated 3 years ago
- A scalable, mature and versatile web crawler based on Apache Storm ☆976 · Apr 21, 2026 · Updated last week
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON. ☆70 · Jun 8, 2021 · Updated 4 years ago
- A scalable frontier for web crawlers ☆1,329 · Jun 6, 2025 · Updated 10 months ago
- Visually diff websites ☆20 · Jan 22, 2018 · Updated 8 years ago
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues. ☆440 · Dec 30, 2022 · Updated 3 years ago
- The next web scraper. See through the <html> noise. ☆5,905 · Feb 16, 2026 · Updated 2 months ago
- 🔮 A Node.js scraper for humans. ☆4,072 · Oct 13, 2025 · Updated 6 months ago
- ☆10 · Dec 23, 2019 · Updated 6 years ago
- Automatically extracts structured information from webpages ☆111 · Jun 23, 2022 · Updated 3 years ago
- A job scraper using the Scrapy framework ☆16 · Oct 20, 2017 · Updated 8 years ago
- Django app to manage musics, users and their favourite musics ☆11 · May 24, 2019 · Updated 6 years ago
- Chromium / Puppeteer site crawler ☆48 · Mar 30, 2020 · Updated 6 years ago
- Puppeteer Pool, run a cluster of instances in parallel ☆3,515 · Mar 1, 2026 · Updated 2 months ago
- Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head ☆174 · May 19, 2020 · Updated 5 years ago
- ExtractContent for node.js ☆15 · Apr 14, 2026 · Updated 2 weeks ago
- Search Engine API with Node/Express and Puppeteer using Google Search ☆12 · Dec 11, 2022 · Updated 3 years ago
- Deprecated. Use https://github.com/no-shot/env instead! ☆11 · May 31, 2021 · Updated 4 years ago
- A simple collaborative textfield for nodejs ☆18 · Apr 26, 2017 · Updated 9 years ago
- Easily create XML sitemaps for your website. ☆452 · Jun 28, 2024 · Updated last year
- Javascript scraping module based on puppeteer for many different search engines... ☆571 · Dec 30, 2022 · Updated 3 years ago
- 🔎 📖 ✨ Custom, private search engine for text documents built with NextJS/React/ES6/ES7 ☆32 · Mar 20, 2025 · Updated last year
- A CDN/API service for Undraw, the MIT-licensed illustrations by Katerina Limpitsouni ☆12 · Aug 3, 2019 · Updated 6 years ago
- Broad crawler for domain discovery ☆20 · Apr 8, 2026 · Updated 3 weeks ago
- IRL version of Chrome Offline T-Rex game ☆12 · Apr 16, 2025 · Updated last year
- Scrapoxy has been discontinued. ☆2,422 · Feb 7, 2026 · Updated 2 months ago
- A Node.js module to search and scrape Google. ☆456 · Oct 4, 2018 · Updated 7 years ago
- NER toolkit for HTML data ☆259 · May 3, 2024 · Updated last year
- 📦 A set of small and performant JS and Twig components ☆11 · Apr 22, 2026 · Updated last week
- CQRS example with Go, MySQL, NATS, ElasticSearch ☆11 · Jun 1, 2018 · Updated 7 years ago
- A simple and fully customizable web crawler/spider for Node.js with server-side DOM. Comes with elegant and hell-simple APIs. ☆25 · Jul 27, 2021 · Updated 4 years ago
- This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster. ☆1,228 · Nov 7, 2023 · Updated 2 years ago
- Ultimate Website Sitemap Parser ☆249 · Jan 25, 2026 · Updated 3 months ago
- A complete and versatile web scraper. ☆3,718 · Oct 18, 2020 · Updated 5 years ago
- Command line tool to write to x86 boot flash chips via the PCH ☆14 · Mar 30, 2017 · Updated 9 years ago
- A page scraping DSL for extracting structured information from unstructured XHTML, built on Node.js and jQuery ☆49 · Jan 9, 2015 · Updated 11 years ago