bitmakerla / estela
estela, an elastic web scraping cluster 🕸
☆175 Updated last week
Alternatives and similar repositories for estela:
Users who are interested in estela are comparing it to the libraries listed below.
- A fork of Dragnet that also extracts author, headline, date, and keywords from context, as well as built-in metadata extraction, all in one pac… ☆256 Updated last year
- Scrapy rotation proxy package with advanced functions ☆94 Updated 2 years ago
- Scrapy Extension for monitoring spiders execution. ☆534 Updated last month
- Scrapy download handler that can impersonate browsers' TLS signatures or JA3 fingerprints. ☆121 Updated 3 months ago
- Web scraping Page Objects core library ☆96 Updated 3 months ago
- ☆124 Updated last year
- Minimal set of tools to conduct stealthy scraping. ☆153 Updated last year
- playwright stealth ☆595 Updated 6 months ago
- Page Object pattern for Scrapy ☆119 Updated this week
- Scrapyd on container infrastructure ☆13 Updated last week
- The Web Scraping Club Free Repository ☆136 Updated 2 months ago
- A test suite of common scraper detection techniques. See how detectable your scraper stack is. ☆136 Updated 2 years ago
- Home of the Ulixee Open Data Platform ☆50 Updated last month
- ☆70 Updated 10 months ago
- Python SDK for Inngest: Durable functions and workflows in Python, hosted anywhere ☆63 Updated last week
- Zyte Automatic Extraction integration for Scrapy ☆56 Updated 2 years ago
- 🎭 Intelligent browser header & fingerprint generator ☆343 Updated last month
- Pyppeteer integration for Scrapy ☆59 Updated 3 years ago
- Parsing JavaScript objects into Python data structures ☆201 Updated 3 weeks ago
- A blazing fast, async-first, undetectable webscraping/web automation framework based on ultrafunkamsterdam/nodriver. Now with Docker supp… ☆151 Updated this week
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues. ☆422 Updated 2 years ago
- Undetected version of the Playwright testing and automation library. ☆391 Updated last week
- Scrapy extension that gives you all the scraping monitoring, alerting, scheduling, and data validation you will need straight out of the… ☆36 Updated 6 months ago
- Clean, filter and sample URLs to optimize data collection - Python & command-line - Deduplication, spam, content and language filters ☆133 Updated last month
- Library that helps use puppeteer in scrapy. ☆52 Updated last month
- Super Fast, Super Anti-Detect, and Super Intuitive Web Driver ☆48 Updated 4 months ago
- Zyte API integration for Scrapy ☆37 Updated this week
- Comprehensive wrapper and execution manager for the Chrome browser using the Chrome Debugging Protocol. ☆220 Updated last month
- A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppetee… ☆92 Updated 2 years ago
- A Scrapy middleware to bypass Cloudflare's anti-bot protection ☆105 Updated 3 years ago