christian-fei / mega-scraper
the mega scraper - scrape a website's content
☆28 Updated 5 years ago
Alternatives and similar repositories for mega-scraper
Users who are interested in mega-scraper are comparing it to the libraries listed below.
- Robust text renderer using headless chrome. ☆66 Updated 2 years ago
- A plugin for puppeteer-extra to add proxy support ☆18 Updated 2 years ago
- 🧱 A uniform template to use as a foundation for Puppeteer bot construction. ☆68 Updated 4 years ago
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON. ☆70 Updated 4 years ago
- Simple proxy rotation service ☆30 Updated 10 years ago
- Nodejs web scraper. Contains a command line, docker container, terraform module and ansible roles for distributed cloud scraping. Support… ☆112 Updated 2 years ago
- Language agnostic named entity recognizer ☆41 Updated 2 years ago
- Scrape subreddits based on search criteria or get the X latest from 'hot' or 'new' categories ☆27 Updated 4 years ago
- A `htmlparser2` handler for parsing rich metadata from HTML. Includes HTML metadata, JSON-LD, RDFa, microdata, OEmbed, Twitter cards and … ☆56 Updated 2 years ago
- Create a stream of Sequelize create, update, and destroy events. ☆11 Updated 5 years ago
- a puppeteer walker 🕷 🕸 ☆79 Updated 5 years ago
- Naive Bayes Classifier in JavaScript ☆32 Updated 8 years ago
- Flexible nodejs HMAC authentication module for express/connect and beyond ☆36 Updated 5 years ago
- Identifies and extracts phone numbers from arbitrary text ☆39 Updated 8 years ago
- Naive Bayes Text Classifier ☆42 Updated 10 months ago
- Chromium / Puppeteer site crawler ☆48 Updated 5 years ago
- Technologies I've learned ☆66 Updated last month
- Extracts email address from an arbitrary text input. ☆64 Updated 10 months ago
- Extracts all JSON objects from an arbitrary text document. ☆30 Updated 5 years ago
- List of words for making random mnemonic sentences ☆86 Updated last year
- Extracts prices from an arbitrary text input. ☆16 Updated 6 years ago
- Convert a URL to a valid filename ☆80 Updated 3 months ago
- Refresh, monitor and balance your proxies ☆16 Updated 6 months ago
- Friendly web crawler for x-ray ☆44 Updated 2 years ago
- A sparse array optimised for low memory whilst still being fast ☆32 Updated 3 years ago
- 🌃 Start and control a Tor instance. ☆12 Updated 3 years ago
- Similarity identification in JS ☆37 Updated last year
- A proxy that sits in between a chromium devtools frontend and the remote chromium being debugged and logs requests, responses and websock… ☆21 Updated 5 years ago
- Chrome binary compatible with AWS Lambda. ☆55 Updated 6 years ago
- An experimental distributed JWT token cracker built using Node.js and ZeroMQ ☆57 Updated last year