lorien / awesome-web-scraping
List of libraries, tools and APIs for web scraping and data processing.
☆6,892 Updated last month
Alternatives and similar repositories for awesome-web-scraping:
Users who are interested in awesome-web-scraping are comparing it to the repositories listed below.
- A collection of awesome web crawler, spider in different languages ☆6,628 Updated 8 months ago
- Lightweight, scriptable browser as a service with an HTTP API ☆4,123 Updated 6 months ago
- Visual scraping for Scrapy ☆9,352 Updated 7 months ago
- Web Scraping Framework ☆2,401 Updated 11 months ago
- Scrapy+Splash for JavaScript integration ☆3,182 Updated last week
- A curated list of awesome packages, articles, and other cool resources from the Scrapy community. ☆544 Updated 2 years ago
- A list of (almost) all headless web browsers in existence ☆6,311 Updated 8 months ago
- 🔍 A helpful checklist/collection of Search Engine Optimization (SEO) tips and techniques. ☆2,500 Updated 9 months ago
- Tools of The Trade, from Hacker News. ☆16,665 Updated 6 months ago
- Random proxy middleware for Scrapy ☆1,664 Updated 5 years ago
- HTTP API for Scrapy spiders ☆849 Updated 7 months ago
- A Powerful Spider(Web Crawler) System in Python. ☆16,541 Updated 9 months ago
- The definitive list of lists (of lists) curated on GitHub and elsewhere ☆10,179 Updated last month
- admin ui for scrapy/open source scrapinghub ☆2,751 Updated last year
- A curated list of SEO (Search Engine Optimization) links. ☆680 Updated this week
- A service daemon to run Scrapy spiders ☆3,002 Updated 2 weeks ago
- Scrapy, a fast high-level web crawling & scraping framework for Python. ☆54,236 Updated this week
- A collaborative list of great resources about RESTful API architecture, development, test, and performance ☆3,686 Updated last week
- The most awesome list about bots ⭐️🤖 ☆3,883 Updated 7 months ago
- 📚 A collection of open and closed source Content Management Systems (CMS) for your perusal. ☆2,973 Updated 3 months ago
- Design and development guides ☆2,244 Updated 5 months ago
- Pythonic HTML Parsing for Humans™ ☆13,780 Updated 10 months ago
- A scalable frontier for web crawlers ☆1,307 Updated 2 weeks ago
- A curated list of awesome command-line frameworks, toolkits, guides and gizmos. Inspired by awesome-php. ☆33,773 Updated 6 months ago
- This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster. ☆1,189 Updated last year
- Tools for building bots ☆1,397 Updated 11 months ago
- A pure-python HTML screen-scraping library ☆1,870 Updated 2 years ago
- A curated list of awesome puppeteer resources. ☆2,439 Updated 7 months ago
- A curated list of awesome JSON datasets that don't require authentication. ☆3,369 Updated 2 months ago
- 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and mor… ☆23,271 Updated last week