get-set-fetch / scraper
Node.js web scraper. Includes a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, jsdom.
☆107Updated last year
Related projects
Alternatives and complementary repositories for scraper
- House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.☆117Updated last year
- Web scraping extension☆72Updated 2 months ago
- 🧱 A uniform template to use as a foundation for Puppeteer bot construction.☆64Updated 3 years ago
- Node.js package for generating browser-like headers.☆64Updated 2 years ago
- Base Docker images for Apify actors.☆70Updated last week
- Cloud crawler functions for scrapeulous☆44Updated 3 years ago
- Minimal set of tools to conduct stealthy scraping.☆150Updated last year
- Session persistence plugin for puppeteer-extra☆21Updated 2 years ago
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee…☆90Updated 2 years ago
- A simple puppeteer wrapper to enable useful plugins with ease☆54Updated this week
- Phantombuster's SDK☆14Updated last month
- The ultimate tool for obtaining free proxies from multiple sources and storing them in a MongoDB database.☆22Updated last year
- Web data extraction tool implemented as a Chrome extension with many more features☆46Updated 6 years ago
- Hosted web-client for the browserless debugger☆45Updated 3 weeks ago
- You can use this act to monitor any page's content and get a notification when content changes.☆19Updated 2 years ago
- A puppeteer-extra plugin to remotely view and interact with puppeteer sessions. Essentially opening a "portal" to the page.☆49Updated last year
- Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt☆77Updated 8 months ago
- 🏴 A straightforward forward-proxy written in Node.js.☆77Updated 6 months ago
- An undetectable browser automation framework 🤖☆29Updated 3 years ago
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON.☆69Updated 3 years ago
- All-in-one API to easily scrape data from any website, without worrying about captchas and bot detection mechanisms.☆21Updated last year
- Parses OTP messages for a verification code and service provider.☆24Updated last year
- Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other …☆28Updated 2 years ago
- A plugin for puppeteer-extra to add proxy support☆17Updated last year
- Standalone puppeteer playground in browser's developer tools.☆210Updated last year
- Aliexpress.com scraper developed for Apify☆1Updated last year
- Extracts email addresses from arbitrary text input.☆62Updated 4 months ago
- Utilities and constants shared across Apify projects.☆12Updated this week
- A browser extension that lets you find email addresses for any domain with a single click.☆68Updated 7 years ago