get-set-fetch / scraper
Node.js web scraper. Includes a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, jsdom.
☆112 Updated 2 years ago
Alternatives and similar repositories for scraper
Users who are interested in scraper are comparing it to the libraries listed below.
- web scraping extension ☆84 Updated 4 months ago
- Amazon products scraper using rotating proxies and headless Chrome from ScrapingAnt ☆87 Updated last year
- A single tab web browser built with puppeteer. Also, no client-side JS. Viewport is streamed with MJPEG. For realz. ☆59 Updated 2 years ago
- House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases. ☆128 Updated last week
- Base Docker images for Apify actors. ☆89 Updated last week
- Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other … ☆31 Updated 3 years ago
- A simple puppeteer wrapper to enable useful plugins with ease ☆57 Updated this week
- Web data extraction tool implemented as chrome extension ☆270 Updated this week
- PixieBrix browser extension ☆87 Updated last year
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee… ☆98 Updated 3 years ago
- Chromium Browser Automation (extension for chrome browser automation). ☆124 Updated last year
- Web data extraction tool implemented as chrome extension with many more features ☆46 Updated 7 years ago
- Extracts email addresses from an arbitrary text input. ☆64 Updated 10 months ago
- 🧱 A uniform template to use as a foundation for Puppeteer bot construction. ☆69 Updated 4 years ago
- An undetectable browser automation framework 🤖 ☆35 Updated 4 years ago
- Node.js library and CLI for scraping websites using Puppeteer (or not) and YAML definitions ☆47 Updated 2 years ago
- Email automation driven by headless chrome. ☆167 Updated 4 years ago
- Building extensible automation. Tideflow is a real-time, open source workflow execution and monitoring web application. ☆114 Updated 2 years ago
- Grammarify is an npm package that safely cleans up text that has misspellings, improper capitalization, lexical illusions, among other thin… ☆73 Updated 2 years ago
- Web scraper using Cloudflare Workers ☆26 Updated 4 years ago
- Hosted web-client for the browserless debugger ☆50 Updated 2 months ago
- Node.js package for generating browser-like headers. ☆71 Updated 3 years ago
- Automatically monitor and log fan counters from social media (Facebook Pages, Twitter, Instagram, YouTube, Google+, OneSignal, Alexa) usin… ☆65 Updated 7 years ago
- Utilities and constants shared across Apify projects. ☆17 Updated last week
- Minimal set of tools to conduct stealthy scraping. ☆162 Updated 2 years ago
- Evaluate JavaScript on a URL through headless Chrome browser. ☆25 Updated 4 years ago
- Command-line Google search that saves results to JSON ☆108 Updated 2 years ago
- ☆25 Updated 4 years ago
- Use plain HTML to connect your website to Google Sheets ☆45 Updated 2 years ago
- Google Search SERP Scraper ☆120 Updated last month