wanghaisheng / awesome-web-data-extractor
A curated list of promising web data extractor resources
☆28 · Updated 5 years ago
Alternatives and similar repositories for awesome-web-data-extractor:
Users who are interested in awesome-web-data-extractor are comparing it to the libraries listed below.
- PostHog with text analytics extensions, serving as an advanced LLM analytics platform. ☆11 · Updated 5 months ago
- Zyte Automatic Extraction integration for Scrapy ☆56 · Updated 3 years ago
- ☆29 · Updated 3 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 · Updated last year
- Aiohttp web server API, which scrapes Google and returns scrape results as response. Supports proxies, multiple geos and number of result… ☆56 · Updated last year
- URL Inspection Tool Automator ☆24 · Updated 2 years ago
- Demo example of consumer goods categorization ☆26 · Updated last year
- A Python library to detect and extract listing data from an HTML page. ☆108 · Updated 7 years ago
- Processes data from images that are tagged with the specified Instagram tag. ☆13 · Updated 11 years ago
- Library that helps use Puppeteer in Scrapy. ☆52 · Updated last month
- Pre-built Scrapy spiders for AutoExtract ☆19 · Updated 10 months ago
- Neural Elastic Inference and Search ☆19 · Updated 5 years ago
- Dockerfile and web server for running GPT-J-6B on AWS GPU instances ☆18 · Updated 3 years ago
- A more flexible and feature-rich Frontera scheduler for Scrapy ☆36 · Updated 3 months ago
- Integration between Reaction ECommerce and Accelerated Text to provide product descriptions for an e-shop. ☆12 · Updated 4 years ago
- Extract social media links and account names from websites. ☆37 · Updated 4 years ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 · Updated 6 years ago
- Google Search Results Pages Dashboard ☆36 · Updated 2 years ago
- Open Collaborative AI Driven Parser builder for Web Scraping, Data Extraction and Crawling, Knowledge Graph · Updated last month
- Awesome keyword research, constantly updated with practical information gathered daily ☆29 · Updated 7 years ago
- This project experiments with the Google NLP Algorithm to evaluate e-commerce product descriptions from an SEO perspective. ☆17 · Updated 4 years ago
- Generate product descriptions, blogs, ads and more using GPT architecture with a single request to the TextCortex API, a.k.a. Hemingwai ☆40 · Updated 2 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆56 · Updated last year
- Easy extraction of keywords and engines from search engine results pages (SERPs). ☆90 · Updated 3 years ago
- A Python implementation of DEPTA ☆83 · Updated 8 years ago
- A script for downloading performance and account structure from Google AdWords API ☆18 · Updated 4 years ago
- Cloud crawler functions for scrapeulous ☆45 · Updated 4 years ago
- A News Article Collection Library ☆22 · Updated last year
- Various Jupyter notebooks about Common Crawl data ☆51 · Updated 2 weeks ago
- SEO automations using a data science toolset and more ☆13 · Updated 2 years ago