madisonmay / CommonRegex
A collection of common regular expressions bundled with an easy to use interface.
☆1,582 · Updated 2 years ago
Alternatives and similar repositories for CommonRegex
Users who are interested in CommonRegex are comparing it to the libraries listed below.
- Find dates inside text using Python and get back datetime objects ☆665 · Updated last year
- A simple Python module for parsing human names into their individual components ☆697 · Updated last year
- 🪼 a python library for doing approximate and phonetic matching of strings. ☆2,172 · Updated last week
- Python address detector and parser ☆213 · Updated last year
- spellchecking library for python ☆614 · Updated 2 months ago
- Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL). ☆1,954 · Updated last month
- Web Content Retrieval for Humans™ ☆632 · Updated 3 years ago
- Parse human-readable date/time strings ☆708 · Updated last month
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆378 · Updated 2 years ago
- a python library for parsing unstructured western names into name components. ☆612 · Updated 6 months ago
- A toolkit for making domain-specific probabilistic parsers ☆805 · Updated last year
- A simple fuzzy matching set for python strings ☆230 · Updated last year
- Heuristic based boilerplate removal tool ☆806 · Updated 9 months ago
- The simplest way to extract text from PDFs in Python ☆428 · Updated 3 years ago
- python parser for human readable dates ☆2,750 · Updated last month
- Python 2.7 Regular Expression cheatsheet, as a restructured text document and Makefile to convert it to PDF ☆517 · Updated last year
- A fast and friendly PDF scraping library. ☆783 · Updated 2 years ago
- Accurately generate all possible forms of an English word e.g "election" --> "elect", "electoral", "electorate" etc. ☆632 · Updated 4 years ago
- [not actively maintained] A lightweight Python library that uses Webkit to enable easy scraping of dynamic, Javascript-heavy web pages ☆532 · Updated 8 years ago
- Fixes mojibake and other glitches in Unicode text, after the fact. ☆3,989 · Updated last year
- Clean personally identifiable information from dirty dirty text. ☆416 · Updated 2 years ago
- pyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance. ☆250 · Updated 2 months ago
- next generation web crawling using machine intelligence ☆331 · Updated 2 years ago
- extract text from any document. no muss. no fuss. ☆4,382 · Updated last year
- Delorean: Time Travel Made Easy ☆1,836 · Updated 2 years ago
- Extracts the top level domain (TLD) from the URL given. ☆181 · Updated 6 months ago
- Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages ☆542 · Updated 4 years ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆1,276 · Updated 4 years ago
- Port of Google's language-detection library to Python. ☆1,856 · Updated 9 months ago
- Python bindings to libpostal for fast international address parsing/normalization ☆858 · Updated last month