lorien / ioweb
☆35 Updated this week
Related projects:
- A simple, Qt WebEngine-powered web browser with built-in functionality for basic Scrapy web scraping support. ☆106 Updated 3 months ago
- Web scraping Page Objects core library ☆93 Updated 2 months ago
- A browser extension to monitor your spiders deployed on Scrapy Cloud. ☆15 Updated 3 years ago
- Analyze scraped data ☆47 Updated 4 years ago
- A component that tries to avoid downloading duplicate content ☆27 Updated 6 years ago
- ☆53 Updated this week
- A micro-framework for asynchronous deep crawls and web scraping with Python ☆13 Updated last year
- ☆29 Updated 3 years ago
- Lightweight library that converts an HTML webpage to JSON data using a template defined in JSON. ☆21 Updated 3 years ago
- Scrapy middleware that allows crawling only new content ☆79 Updated last year
- Zyte Automatic Extraction integration for Scrapy ☆55 Updated 2 years ago
- https://mimesniff.spec.whatwg.org/ implementation for Python ☆14 Updated 8 months ago
- A simple Python tool that generates a requests/bs4-based web scraper ☆26 Updated 2 years ago
- Streaming web crawler with WebSocket API ☆44 Updated last year
- A pure-Python robots.txt parser with support for modern conventions. ☆54 Updated 3 months ago
- Page Object pattern for Scrapy ☆119 Updated 2 months ago
- Scrapy middleware to add extra fields to items, such as timestamp, response fields, and spider attributes. ☆56 Updated 2 years ago
- Library to populate items using XPath and CSS with a convenient API ☆44 Updated 3 months ago
- Python client for Zyte API ☆19 Updated 3 months ago
- Makes sending emails easy and DRY, for Python 3. ☆220 Updated 3 years ago
- AnyAPI is a library that helps you write any API wrapper with ease and in a Pythonic way. ☆132 Updated 2 years ago
- Asyncio web crawling framework. Work in progress. ☆18 Updated last month
- Python clients for Zyte AutoExtract API ☆39 Updated 2 years ago
- A Scrapy extension to store request and response information in a storage service ☆26 Updated 2 years ago
- Paginating the web ☆37 Updated 10 years ago
- Simple secure asynchronous message queue ☆20 Updated 2 months ago
- 🕶 Awesome list of Scrapy tools and libraries ☆54 Updated 4 years ago
- Advanced news feed extractor and finder library. Helps automatically extract news from websites without RSS/Atom feeds ☆76 Updated last year
- Library for scraping websites or APIs at any scale ☆53 Updated 7 months ago
- ⇔ IterTable is a Pythonic API for iterating through tabular data formats, including CSV, XLSX, XML, and JSON. ☆51 Updated last year