kata198 / AdvancedHTMLParser
Fast, indexed Python HTML parser that builds a DOM node tree and provides the common getElementsBy* functions for scraping, testing, modification, and formatting. Also supports XPath.
☆102 Updated last year
Alternatives and similar repositories for AdvancedHTMLParser
Users who are interested in AdvancedHTMLParser are comparing it to the libraries listed below.
- CSS Selectors for Python ☆298 Updated last month
- PyQuery-based scraping micro-framework. ☆117 Updated 3 years ago
- CSS related utilities (parsing, serialization, etc) for Python ☆32 Updated 9 months ago
- A simple, Qt-Webengine powered web browser with built-in functionality for basic Scrapy web scraping support. ☆109 Updated last year
- A fork of http://pydispatcher.sourceforge.net/ with PyPy support ☆16 Updated 7 years ago
- URL Transformation, Sanitization ☆103 Updated last year
- Trio driver for Chrome DevTools Protocol (CDP) ☆68 Updated 3 years ago
- Python implementation of the Parsley language for extracting structured data from web pages ☆92 Updated 7 years ago
- Web technology based GUI library for desktop applications ☆75 Updated 7 years ago
- Small HTTP Server used with Flask and werkzeug ☆53 Updated 4 years ago
- Use pyppeteer from a Scrapy spider ☆59 Updated 5 years ago
- Python bindings to the Brotli compression library ☆147 Updated 9 months ago
- Python type wrappers for Chrome DevTools Protocol (CDP) ☆113 Updated 2 months ago
- Embed the Duktape JS interpreter in Python ☆81 Updated 2 years ago
- Scrapy schema validation pipeline and Item builder using JSON Schema ☆44 Updated 4 years ago
- A decorator to write coroutine-like spider callbacks. ☆109 Updated 2 years ago
- Detect and classify pagination links ☆103 Updated 4 years ago
- Extract structured data from HTML and XML documents like a boss. ☆49 Updated 6 months ago
- Analyze scraped data ☆46 Updated 5 years ago
- IO of git-style object databases ☆224 Updated 3 weeks ago
- Scrapy spider middleware to split an item into multiple items using a multi-valued key ☆20 Updated 8 years ago
- Asyncio web crawling framework. Work in progress. ☆19 Updated 10 months ago
- Extension to ast that allows ast -> Python code generation. ☆79 Updated 7 years ago
- Scrapinghub Command Line Client ☆133 Updated 2 months ago
- A Python command line application framework ☆120 Updated 2 years ago
- A complimentary proxy to help to use SPM with headless browsers ☆108 Updated 2 years ago
- Pyppeteer integration for Scrapy ☆58 Updated 4 years ago
- Mobilenium allows you to use Selenium and have access to status codes and HTTP headers, without the need for manual labor. ☆20 Updated 5 years ago
- Find which links on a web page are pagination links ☆29 Updated 8 years ago
- Async to sync converter ☆77 Updated last year