brendonboshell / supercrawler
A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
☆382Updated 3 years ago
Alternatives and similar repositories for supercrawler
Users that are interested in supercrawler are comparing it to the libraries listed below
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues.☆436Updated 3 years ago
- Email automation driven by headless chrome.☆167Updated 5 years ago
- Web crawler for Node.JS☆257Updated 7 years ago
- Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)☆497Updated 5 years ago
- House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.☆128Updated this week
- Blazingly fast, multi tenant, faceted search API☆313Updated 5 years ago
- Google Search SERP Scraper☆129Updated last month
- Node.js email SMTP verification, powered by EmailChecker.com API☆297Updated 3 months ago
- Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSO…☆153Updated 2 years ago
- simple multi-level scraper json input/output for Cheerio☆199Updated 3 years ago
- Javascript scraping module based on puppeteer for many different search engines...☆566Updated 3 years ago
- Verify email address checking MX records, and SMTP connection.☆123Updated 4 years ago
- Nodejs lib to parse Google SERP html pages☆46Updated 2 years ago
- Google search scraper with captcha solving support☆90Updated 6 years ago
- Chromium / Puppeteer site crawler☆48Updated 5 years ago
- plugin to extract keywords and key-phrases☆337Updated last year
- Automatically extract body content (and other cool stuff) from an html document☆2,163Updated 2 years ago
- Node module that summarizes text using a naive summarization algorithm☆770Updated 2 weeks ago
- A Node.js module to search and scrape Google.☆456Updated 7 years ago
- Scrape/Crawl article from any site automatically. Make any web page readable, no matter Chinese or English.☆346Updated 7 years ago
- A light, fast and flexible javascript tracking library☆263Updated 2 years ago
- A look at how LinkedIn spies on its users.☆832Updated 7 years ago
- Example project demonstrating Headless Chrome + Puppeteer running in their own individual containers.☆70Updated 3 years ago
- Easily create XML sitemaps for your website.☆452Updated last year
- Cloud crawler functions for scrapeulous☆45Updated 4 years ago
- Small tool to wait until all XHR requests have finished in Puppeteer☆278Updated 2 weeks ago
- Simple, lightweight and expressive web scraping with Node.js☆153Updated 4 years ago
- A modern, minimalist, and lightweight URL shortener using Node.js and Redis☆269Updated 7 years ago
- International sales tax calculator for Node (offline, but provides optional online VAT number fraud check). Tax rates are kept up-to-dat…☆330Updated 3 months ago
- Node.JS library and cli for scraping websites using Puppeteer (or not) and YAML definitions☆49Updated 3 years ago