bitmakerla / estela
estela, an elastic web scraping cluster 🕸
☆180 · Updated last month
Alternatives and similar repositories for estela:
Users who are interested in estela are comparing it to the libraries listed below.
- Page Object pattern for Scrapy ☆121 · Updated 2 months ago
- Scrapy rotation proxy package with advanced functions ☆95 · Updated 2 years ago
- Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fingerprints. ☆145 · Updated last month
- ☆74 · Updated 2 months ago
- Web scraping Page Objects core library ☆99 · Updated 2 months ago
- The Web Scraping Club Free Repository ☆139 · Updated 5 months ago
- Scrapy Extension for monitoring spiders execution. ☆540 · Updated last week
- A fork of Dragnet that also extracts author, headline, date, keywords from context, as well as built-in metadata extraction all in one pac… ☆275 · Updated last year
- Library that helps use puppeteer in scrapy. ☆52 · Updated last week
- Minimal set of tools to conduct stealthy scraping. ☆156 · Updated 2 years ago
- Home of the Ulixee Open Data Platform ☆50 · Updated 4 months ago
- Super Fast, Super Anti-Detect, and Super Intuitive Web Driver ☆60 · Updated last week
- Zyte Automatic Extraction integration for Scrapy ☆56 · Updated 3 years ago
- A fork of https://github.com/AtuboDad/playwright_stealth ☆83 · Updated 3 weeks ago
- Module that extracts structured information from a rendered HTML site and outputs JSON. HTML to JSON. ☆70 · Updated 3 years ago
- A test suite of common scraper detection techniques. See how detectable your scraper stack is. ☆137 · Updated 2 years ago
- A blazing fast, async-first, undetectable webscraping/web automation framework based on ultrafunkamsterdam/nodriver. Now with Docker supp… ☆370 · Updated this week
- 🎭 Intelligent browser header & fingerprint generator ☆508 · Updated last month
- Patching CDP (Chrome DevTools Protocol) leaks on OS level. Easy to use with Playwright, Selenium, and other web automation tools. ☆112 · Updated 8 months ago
- Scrapy extension that gives you all the scraping monitoring, alerting, scheduling, and data validation you will need straight out of the… ☆36 · Updated 9 months ago
- A drop-in replacement for playwright-python patched with rebrowser-patches. It allows passing modern automation detection tests. ☆63 · Updated 4 months ago
- Use multiple proxies with Scrapy ☆756 · Updated 2 years ago
- Pyppeteer integration for Scrapy ☆58 · Updated 4 years ago
- Zyte API integration for Scrapy ☆38 · Updated last week
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues. ☆429 · Updated 2 years ago
- A blazing-fast Python HTTP Client with TLS fingerprint ☆287 · Updated this week
- Vindicate non-organic web traffic via MITM proxy ☆54 · Updated 9 months ago
- Scrapy + Puppeteer ☆111 · Updated 3 years ago
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters ☆137 · Updated 3 months ago
- playwright stealth ☆656 · Updated 8 months ago