scrapy / xtractmime
https://mimesniff.spec.whatwg.org/ implementation for Python
☆13 · Updated last year
Alternatives and similar repositories for xtractmime:
Users who are interested in xtractmime are comparing it to the libraries listed below.
- A fork of http://pydispatcher.sourceforge.net/ with PyPy support ☆16 · Updated 7 years ago
- A component that tries to avoid downloading duplicate content ☆27 · Updated 6 years ago
- Pluggable DSL that uses pipes to perform a series of linear transformations to extract data ☆16 · Updated 9 months ago
- Commons of stupid, simple Python micro functions. Pull requests very welcome. ☆19 · Updated 3 weeks ago
- Datasette plugin for searching all searchable tables at once ☆24 · Updated 8 months ago
- Build requirements files from setup.py. ☆27 · Updated 2 years ago
- extract difference between two html pages ☆32 · Updated 6 years ago
- Data cleaning and validation functions for names, languages, identifiers, etc. ☆20 · Updated this week
- ☆12 · Updated 8 years ago
- A scrapy extension to store requests and responses information in storage service ☆26 · Updated 3 years ago
- Server monitoring and data-collection daemon ☆10 · Updated 6 years ago
- Support files exposing JSON from the JSON Schema specifications to Python ☆12 · Updated last week
- Datasette plugin for authenticating access using API tokens ☆12 · Updated 8 months ago
- Symbolic Constants in Python ☆23 · Updated 8 months ago
- Pytest plugin that runs PyStack on slow or hanging tests. ☆16 · Updated 5 months ago
- Scrape various open data directories to create an index of what's available out there ☆36 · Updated 2 months ago
- Code metrics for Python code. ☆10 · Updated 10 years ago
- Scrapy middleware for the autologin ☆37 · Updated 6 years ago
- Web scraping Page Objects core library ☆99 · Updated 2 months ago
- OpenSSF Scorecard for top Python packages ☆16 · Updated this week
- Python module for Named Entity Recognition (NER) using natural language processing. ☆13 · Updated 3 years ago
- py.test plugin for checking requirements files ☆22 · Updated 5 years ago
- A tiny search engine. ☆13 · Updated 2 years ago
- What's in the Python stdlib ☆11 · Updated this week
- Python library for modern thread / multiprocessing pooling and task processing via asyncio ☆15 · Updated 4 years ago
- Python client for Zyte API ☆24 · Updated last month
- Jinja2 extension to handle git-specific things ☆16 · Updated 3 weeks ago
- ☆10 · Updated 11 months ago
- Restrict crawl and scraping scope using matchers. ☆25 · Updated 8 years ago
- Scrapy middleware which allows to crawl only new content ☆80 · Updated 2 years ago