bitmakerla / estela
estela, an elastic web scraping cluster 🕸
★188 · Updated this week
Alternatives and similar repositories for estela
Users that are interested in estela are comparing it to the libraries listed below
- A fork of Dragnet that also extracts author, headline, date, keywords from context, as well as built-in metadata extraction all in one pac… ★292 · Updated 3 months ago
- Scrapy rotation proxy package with advanced functions ★95 · Updated 3 years ago
- The Web Scraping Club Free Repository ★151 · Updated 3 months ago
- Lego AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements. ★236 · Updated last year
- Use AWS Lambda functions as a proxy pool to scrape web pages. ★137 · Updated last year
- Clean, filter and sample URLs to optimize data collection - Python & command-line - Deduplication, spam, content and language filters ★145 · Updated 8 months ago
- Home of the Ulixee Open Data Platform ★55 · Updated 3 months ago
- ★137 · Updated last year
- Page Object pattern for Scrapy ★122 · Updated last week
- Minimal set of tools to conduct stealthy scraping. ★160 · Updated 2 years ago
- Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues. ★431 · Updated 2 years ago
- dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators ★428 · Updated 5 months ago
- Web scraping Page Objects core library ★101 · Updated last week
- Library that helps use puppeteer in scrapy. ★52 · Updated last month
- Scrapy Extension for monitoring spiders execution. ★546 · Updated 4 months ago
- Scrapy extension that gives you all the scraping monitoring, alerting, scheduling, and data validation you will need straight out of the… ★37 · Updated last year
- Comprehensive wrapper and execution manager for the Chrome browser using the Chrome Debugging Protocol. ★227 · Updated 3 months ago
- ★75 · Updated 2 months ago
- Get structured JSON data from any page. ★177 · Updated last year
- Scrapy project boilerplate done right ★48 · Updated 6 months ago
- Spider templates for automatic crawlers. ★31 · Updated 2 months ago
- playwright stealth ★772 · Updated last year
- 🕷️ Scrapyd is an application for deploying and running Scrapy spiders. ★86 · Updated last month
- Spider ported to Python ★91 · Updated 7 months ago
- Confidence and Byt5 - based geotagging model predicting coordinates from text alone. ★163 · Updated 8 months ago
- Make sense of it all. Semantic data modeling and analytics with a sprinkle of AI. https://totalhack.github.io/zillion/ ★203 · Updated 3 months ago
- Module that extracts structured information from a rendered html site and outputs JSON. HTML to JSON. ★70 · Updated 4 years ago
- Extract price amount and currency symbol from a raw text string ★336 · Updated 6 months ago
- Web scraping for humans ★917 · Updated 9 months ago
- Parsing JavaScript objects into Python data structures ★212 · Updated 3 weeks ago