get-set-fetch / scraper
Node.js web scraper. Includes a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, jsdom.
☆110 · Updated last year
Alternatives and similar repositories for scraper:
Users who are interested in scraper are comparing it to the libraries listed below:
- All In One API to easily scrape data from any website, without worrying about captchas and bot detection mechanisms. ☆21 · Updated last year
- web scraping extension ☆80 · Updated 5 months ago
- 🧱 A uniform template to use as a foundation for Puppeteer bot construction. ☆65 · Updated 3 years ago
- A single tab web browser built with puppeteer. Also, no client-side JS. Viewport is streamed with MJPEG. For realz. ☆56 · Updated last year
- A simple puppeteer wrapper to enable useful plugins with ease ☆55 · Updated this week
- Base Docker images for Apify actors. ☆75 · Updated this week
- Cloud crawler functions for scrapeulous ☆45 · Updated 4 years ago
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON. ☆69 · Updated 3 years ago
- An undetectable browser automation framework 🤖 ☆31 · Updated 3 years ago
- House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases. ☆120 · Updated last year
- Object storage microservice. Like minio but minnier. ☆9 · Updated 5 years ago
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee… ☆94 · Updated 2 years ago
- Generates realistic browser fingerprints ☆72 · Updated 2 years ago
- Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt ☆80 · Updated 11 months ago
- Utilities and constants shared across Apify projects. ☆13 · Updated this week
- An alternative to sticking that lovely web app into an <iframe> on a corp website ☆51 · Updated 3 years ago
- Web crawling & scraping framework for Node.js on top of headless Chrome browser ☆19 · Updated last year
- A free, open source tool to lookup user identities by email address ☆35 · Updated 8 months ago
- Minimal set of tools to conduct stealthy scraping. ☆154 · Updated last year
- Node.js package for generating browser-like headers. ☆67 · Updated 2 years ago
- Home of fingerprint injector. ☆67 · Updated 2 years ago
- Email automation driven by headless chrome. ☆163 · Updated 4 years ago
- Instagram automation driven by headless chrome. ☆117 · Updated 2 years ago
- The ultimate tool for obtaining free proxies from multiple sources and storing them in a MongoDB database. ☆22 · Updated last year
- Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other … ☆29 · Updated 3 years ago
- Web data extraction tool implemented as a Chrome extension, with many more features ☆46 · Updated 6 years ago
- Humanizer functions for Puppeteer ☆36 · Updated last year
- Phantombuster's SDK ☆14 · Updated 4 months ago
- A browser extension that lets you find email addresses for any domain with a single click. ☆71 · Updated 7 years ago
- A case management app built with Lowdefy. ☆32 · Updated last year