scrapinghub / adblockparser
Python parser for Adblock Plus filters
☆195 Updated 6 years ago
Alternatives and similar repositories for adblockparser:
Users who are interested in adblockparser are comparing it to the libraries listed below:
- Modern robots.txt Parser for Python ☆190 Updated last year
- URL Transformation, Sanitization ☆103 Updated last year
- Python implementation of the Parsley language for extracting structured data from web pages ☆92 Updated 7 years ago
- Scrapinghub Command Line Client ☆131 Updated 9 months ago
- Extract text from HTML ☆134 Updated 4 years ago
- A generic crawler ☆78 Updated 6 years ago
- A project that attempts to automatically log in to a website given a single seed ☆123 Updated 2 years ago
- Detect and classify pagination links ☆102 Updated 4 years ago
- A decorator to write coroutine-like spider callbacks. ☆110 Updated 2 years ago
- Python library of web-related functions ☆397 Updated 2 weeks ago
- Python wrapper for RE2 ☆295 Updated last year
- Formasaurus tells you the type of an HTML form and its fields using machine learning ☆118 Updated 8 months ago
- Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes etc. ☆56 Updated 2 years ago
- A python module for retrieving and parsing WHOIS data ☆402 Updated 3 years ago
- A Python library to detect and extract listing data from an HTML page. ☆108 Updated 7 years ago
- CSS Selectors for Python ☆293 Updated 3 weeks ago
- pydig: a DNS query tool written in Python ☆104 Updated 5 months ago
- Paginating the web ☆37 Updated 11 years ago
- Scriptable Google Chrome™ as a HTTP service + asyncio driver ☆118 Updated last year
- Scrapy middleware that allows crawling only new content ☆80 Updated 2 years ago
- ☆143 Updated 9 years ago
- publicsuffixlist for python ☆66 Updated this week
- Extract the difference between two HTML pages ☆32 Updated 6 years ago
- NER toolkit for HTML data ☆259 Updated 9 months ago
- A component that tries to avoid downloading duplicate content ☆27 Updated 6 years ago
- A python implementation of DEPTA ☆83 Updated 8 years ago
- Retrieve and parse whois data for IPv4 and IPv6 addresses ☆562 Updated 4 months ago
- URL normalization for Python ☆94 Updated 2 years ago
- Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative) ☆204 Updated 9 months ago
- Automatic Item List Extraction ☆87 Updated 8 years ago