bitmakerla / estela
estela, an elastic web scraping cluster 🕸
★187 Updated last week
Alternatives and similar repositories for estela
Users that are interested in estela are comparing it to the libraries listed below
- A fork of Dragnet that also extracts author, headline, date, keywords from context, as well as built in metadata extraction all in one pac… ★291 Updated 2 months ago
- Scrapy rotation proxy package with advanced functions ★95 Updated 3 years ago
- The Web Scraping Club Free Repository ★148 Updated 3 months ago
- Lego AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements. ★236 Updated last year
- Web scraping Page Objects core library ★101 Updated last month
- Page Object pattern for Scrapy ★122 Updated last month
- Home of the Ulixee Open Data Platform ★55 Updated 2 months ago
- Minimal set of tools to conduct stealthy scraping. ★159 Updated 2 years ago
- Library that helps use puppeteer in scrapy. ★52 Updated last week
- Scrapy project boilerplate done right ★48 Updated 5 months ago
- A python based HTML to text conversion library, command line client and Web service. ★315 Updated this week
- Scrapy extension that gives you all the scraping monitoring, alerting, scheduling, and data validation you will need straight out of the… ★37 Updated last year
- Use AWS Lambda functions as a proxy pool to scrape web pages. ★135 Updated last year
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON. ★70 Updated 4 years ago
- Clean, filter and sample URLs to optimize data collection - Python & command-line - Deduplication, spam, content and language filters ★142 Updated 7 months ago
- Scrapy Extension for monitoring spiders execution. ★546 Updated 4 months ago
- ★74 Updated last month
- Dockerized FastAPI wrapper around the recognize-anything image recognition models ★25 Updated last year
- Zyte Automatic Extraction integration for Scrapy ★56 Updated 3 years ago
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues. ★431 Updated 2 years ago
- Get structured JSON data from any page. ★177 Updated last year
- ★136 Updated last year
- Make sense of it all. Semantic data modeling and analytics with a sprinkle of AI. https://totalhack.github.io/zillion/ ★203 Updated 2 months ago
- dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators ★429 Updated 4 months ago
- playwright stealth ★747 Updated last year
- A complementary proxy to help use SPM with headless browsers ★108 Updated 2 years ago
- Spider templates for automatic crawlers. ★30 Updated last month
- Parsing JavaScript objects into Python data structures ★212 Updated last week
- Python SDK for Inngest: Durable functions and workflows in Python, hosted anywhere ★115 Updated this week
- Web scraping for humans ★915 Updated 8 months ago