Girbons / mercury-parserpy
Python API wrapper for https://mercury.postlight.com/web-parser/
☆23 Updated last year
Related projects
Alternatives and complementary repositories for mercury-parserpy
- Common interface for data container classes ☆62 Updated this week
- Scrapy middleware to add extra fields to items, such as timestamp, response fields, and spider attributes. ☆56 Updated 2 years ago
- Library to populate items using XPath and CSS with a convenient API ☆45 Updated last month
- Atom, RSS and JSON feed parser for Python 3 ☆116 Updated 2 years ago
- Extract clean(er), readable text from web pages via Mercury Web Parser. ☆117 Updated 4 months ago
- Extract text from HTML ☆132 Updated 4 years ago
- Scrapy downloader middleware that stores response HTMLs to disk. ☆18 Updated 6 months ago
- Scrapy middleware that allows crawling only new content ☆79 Updated 2 years ago
- This is the HeadQuarters of my digital info. HPI library got me inspired and I'm trying to play with the idea on a smaller scale for myse… ☆19 Updated last year
- The most advanced debugging and testing tool for Scrapy ☆16 Updated last year
- URL normalization for Python ☆94 Updated 2 years ago
- Tools to easily generate an RSS feed that contains each scraped item using the Scrapy framework. ☆31 Updated this week
- A micro-framework for asynchronous deep crawls and web scraping with Python ☆13 Updated last year
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- A Python library for finding feed links on websites. ☆50 Updated 2 years ago
- A simple, Qt-Webengine powered web browser with built-in functionality for basic Scrapy web scraping support. ☆106 Updated 6 months ago
- Flask extension for SendGrid. It has the same interface as Flask-Mail. ☆16 Updated last year
- Spider templates for automatic crawlers. ☆24 Updated this week
- linkbak is a web page archiver: it reads a list of links and dumps the corresponding pages in HTML and PDF. ☆14 Updated last year
- Zyte Automatic Extraction integration for Scrapy ☆55 Updated 2 years ago
- Web scraping Page Objects core library ☆95 Updated last month
- Quickly create UIs to interactively prompt, validate, and persist Python objects to disk (JSON/YAML) and back using type hints ☆13 Updated last month
- Save an RSS or Atom feed to a SQLite database ☆47 Updated 2 years ago
- The Temboz RSS/Atom feed reader ☆82 Updated last year
- CLI for evaluating CSS and XPath selectors ☆26 Updated last year
- Python library for finding phone numbers in random user input text. ☆9 Updated 7 years ago
- A pure-Python robots.txt parser with support for modern conventions. ☆55 Updated last week
- A Scrapy extension to store request and response information in a storage service ☆26 Updated 2 years ago
- URL Transformation, Sanitization ☆103 Updated 10 months ago