get-set-fetch / scraper
Node.js web scraper. Includes a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, jsdom.
☆114 Updated 2 years ago
Alternatives and similar repositories for scraper
Users who are interested in scraper are comparing it to the libraries listed below.
- House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases. ☆125 Updated 2 years ago
- Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other … ☆31 Updated 3 years ago
- Base Docker images for Apify actors. ☆85 Updated this week
- Web data extraction tool implemented as a Chrome extension. ☆260 Updated this week
- Automated functional testing via the Chrome DevTools Protocol. Easy to use and open source. Generates unique CSS and XPath selectors. Out… ☆58 Updated 4 years ago
- Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt. ☆85 Updated last year
- Extracts email addresses from arbitrary text input. ☆64 Updated 7 months ago
- Chromium Browser Automation (extension for Chrome browser automation). ☆125 Updated last year
- Standalone Puppeteer playground in the browser's developer tools. ☆233 Updated last year
- Node.js library and CLI for scraping websites using Puppeteer (or not) and YAML definitions. ☆47 Updated 2 years ago
- Email automation driven by headless Chrome. ☆168 Updated 4 years ago
- An undetectable browser automation framework 🤖 ☆34 Updated 3 years ago
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee… ☆98 Updated 2 years ago
- 🧱 A uniform template to use as a foundation for Puppeteer bot construction. ☆69 Updated 4 years ago
- Cloud crawler functions for scrapeulous. ☆45 Updated 4 years ago
- Scrapes WordPress data using the WP-JSON API, activated by default since WordPress 4.7. ☆102 Updated 2 years ago
- A simple Puppeteer wrapper to enable useful plugins with ease. ☆58 Updated this week
- Web data extraction tool implemented as a Chrome extension with many more features. ☆47 Updated 6 years ago
- PixieBrix browser extension. ☆87 Updated 9 months ago
- A browser extension that lets you find email addresses for any domain with a single click. ☆74 Updated 8 years ago
- Module that extracts structured information from a rendered HTML site and outputs JSON. HTML to JSON. ☆70 Updated 4 years ago
- Use Puppeteer in your browser extension. ☆63 Updated 3 years ago
- Hosted web client for the browserless debugger. ☆49 Updated last month
- A case management app built with Lowdefy. ☆32 Updated last year
- Library and CLI for automating captcha verification across multiple providers. ☆123 Updated 5 years ago
- A CLI to post, delete, and manage listings on Craigslist, Letgo, Offerup, and Facebook Marketplace. ☆30 Updated 6 years ago
- Capture website thumbnails in Node using the urlbox screenshot-as-a-service API. ☆25 Updated 11 months ago
- An alternative to sticking that lovely web app into an <iframe> on a corp website. ☆50 Updated 3 years ago
- Automatically monitor and log fan counters from social media (Facebook Pages, Twitter, Instagram, YouTube, Google+, OneSignal, Alexa) usin… ☆63 Updated 7 years ago
- Parses OTP messages for a verification code and service provider. ☆24 Updated 2 years ago