get-set-fetch / scraper
Node.js web scraper. Includes a command-line interface, Docker container, Terraform module, and Ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, jsdom.
☆114 · Updated 2 years ago
Alternatives and similar repositories for scraper
Users who are interested in scraper are comparing it to the libraries listed below.
- A single-tab web browser built with Puppeteer. Also, no client-side JS. Viewport is streamed with MJPEG. For realz. ☆56 · Updated 2 years ago
- Web scraping extension ☆84 · Updated 3 weeks ago
- Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other … ☆31 · Updated 3 years ago
- House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases. ☆123 · Updated 2 years ago
- Base Docker images for Apify actors. ☆82 · Updated this week
- Web data extraction tool implemented as a Chrome extension ☆261 · Updated 2 weeks ago
- Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt ☆84 · Updated last year
- 🧱 A uniform template to use as a foundation for Puppeteer bot construction. ☆68 · Updated 4 years ago
- Extracts email addresses from an arbitrary text input. ☆64 · Updated 6 months ago
- PixieBrix browser extension ☆85 · Updated 8 months ago
- A tutorial for web scraping using the Playwright headless browser ☆128 · Updated last month
- Web data extraction tool implemented as a Chrome extension with many more features ☆47 · Updated 6 years ago
- An undetectable browser automation framework 🤖 ☆34 · Updated 3 years ago
- A simple Puppeteer wrapper to enable useful plugins with ease ☆57 · Updated this week
- Grammarify is an npm package that safely cleans up text that has misspellings, improper capitalization, lexical illusions, among other thin… ☆73 · Updated 2 years ago
- Standalone Puppeteer playground in the browser's developer tools. ☆230 · Updated last year
- Node.js library and CLI for scraping websites using Puppeteer (or not) and YAML definitions ☆45 · Updated 2 years ago
- Cloud crawler functions for scrapeulous ☆45 · Updated 4 years ago
- Module that extracts structured information from a rendered HTML site and outputs JSON. HTML to JSON. ☆70 · Updated 4 years ago
- An alternative to sticking that lovely web app into an <iframe> on a corp website ☆50 · Updated 3 years ago
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee… ☆97 · Updated 2 years ago
- Evaluate JavaScript on a URL through a headless Chrome browser. ☆25 · Updated 4 years ago
- A browser extension that lets you find email addresses for any domain with a single click. ☆73 · Updated 8 years ago
- Building extensible automation. Tideflow is a real-time, open source web application for executing and monitoring workflows. ☆116 · Updated 2 years ago
- DronaHQ offers a low-code platform to build internal tools. Drag-and-drop UI components and connect them to your databases and APIs to bu… ☆59 · Updated 3 months ago
- A puppeteer-extra plugin to remotely view and interact with Puppeteer sessions. Essentially opening a "portal" to the page. ☆53 · Updated 2 years ago
- Instagram automation driven by headless Chrome. ☆119 · Updated 2 years ago
- A self-hosted dashboard and API to share service ports with the team. ☆32 · Updated 2 years ago
- HTML template editor for quickly working with Handlebars and Liquid templates. ☆16 · Updated 2 years ago
- Tools and images to build a Raspberry Pi n8n server ☆76 · Updated 3 years ago