jessepollak / urlmatch
A Python library for easily pattern matching wildcard URLs
☆40 · Updated 8 years ago
Alternatives and similar repositories for urlmatch
Users who are interested in urlmatch are comparing it to the libraries listed below.
- Fast Indexed python HTML parser which builds a DOM node tree, providing common getElementsBy* functions for scraping, testing, modificati… ☆102 · Updated 2 years ago
- A Python binding of SQLite Full Text Search Tokenizer ☆48 · Updated last week
- URL normalization for Python ☆99 · Updated 7 months ago
- ☆26 · Updated last year
- CSS related utilities (parsing, serialization, etc) for python ☆32 · Updated 2 months ago
- Python Simple Object Storage - provides a list and dictionary interface that seamlessly stores data in a file, like a simplified database… ☆58 · Updated 2 years ago
- Lightweight library that converts a HTML webpage to JSON data using a template defined in JSON. ☆23 · Updated 5 months ago
- Loadable spellfix1 extension for sqlite as python package ☆26 · Updated last year
- A helper library full of URL-related heuristics. ☆72 · Updated 2 months ago
- Find which links on a web page are pagination links ☆29 · Updated 8 years ago
- Extract text from HTML ☆135 · Updated 5 years ago
- THIS REPOSITORY IS FORK ☆30 · Updated 2 years ago
- Homoglyphs: get similar letters, convert to ASCII, detect possible languages and UTF-8 group. ☆84 · Updated 4 years ago
- Diff Match Patch is a high-performance library in multiple languages that manipulates plain text. ☆55 · Updated 10 months ago
- Python WSGI Middleware for adding HTTP/S proxy support to any WSGI Application ☆24 · Updated 5 years ago
- Fast multi-keyword search engine for text strings ☆258 · Updated last year
- A tiny framework for building batch applications as a collection of tasks in a workflow. ☆23 · Updated 3 years ago
- Binary Python bindings for poppler utils for content extraction ☆42 · Updated 4 years ago
- Python package for HTTP/1.1 style headers. Parse headers to objects. Most advanced available structure for http headers. ☆122 · Updated 3 weeks ago
- Simple CAPTCHA generator for Python ☆52 · Updated 5 years ago
- Fast Autocomplete: When Elasticsearch suggestions are not fast and flexible enough ☆287 · Updated 2 months ago
- A Python library for extracting titles, images, descriptions and canonical urls from HTML. ☆151 · Updated 5 years ago
- Scrapy downloader middleware that stores response HTMLs to disk. ☆18 · Updated 3 months ago
- JSON microservice for performing HEAD requests ☆34 · Updated 2 years ago
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆191 · Updated 3 years ago
- PyQuery-based scraping micro-framework. ☆118 · Updated 3 years ago
- Modern robots.txt Parser for Python ☆196 · Updated last year
- Python library for extracting text from various file formats (for indexing). ☆113 · Updated 3 years ago
- Lightning Fast Language Prediction ☆167 · Updated 3 months ago
- A Framework For Using HAR Files To Analyze Web Pages ☆155 · Updated last month