sandinmyjoints / fold_to_ascii
A Python port of the Apache Lucene ASCII Folding Filter that converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the ‘Basic Latin’ Unicode block) into ASCII equivalents, if they exist.
☆15 Updated 5 years ago
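As a rough illustration of what ASCII folding means, the sketch below uses only Python's standard `unicodedata` module. It is not the fold_to_ascii API and it does not reproduce Lucene's mapping table: the `fold_sketch` function and the `EXTRA_FOLDS` table are hypothetical names invented for this example, and Unicode compatibility decomposition (NFKD) is swapped in as an approximation of a real folding table. Characters with no ASCII equivalent are left unchanged, in line with the "if they exist" wording above.

```python
import unicodedata

# Hypothetical, deliberately tiny stand-in for a folding table.
# These characters have no Unicode decomposition, so they need explicit entries.
EXTRA_FOLDS = {
    "ß": "ss",
    "Æ": "AE",
    "æ": "ae",
    "Ø": "O",
    "ø": "o",
    "Đ": "D",
    "đ": "d",
    "Ł": "L",
    "ł": "l",
}


def fold_sketch(text: str) -> str:
    """Approximate ASCII folding; an illustration, not the library's algorithm."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Already in the Basic Latin block: keep as-is.
            out.append(ch)
        elif ch in EXTRA_FOLDS:
            out.append(EXTRA_FOLDS[ch])
        else:
            # Decompose (e.g. "é" becomes "e" plus a combining accent)
            # and keep only the ASCII parts of the result.
            decomposed = unicodedata.normalize("NFKD", ch)
            ascii_part = "".join(c for c in decomposed if ord(c) < 128)
            # No ASCII equivalent found: leave the character unchanged.
            out.append(ascii_part if ascii_part else ch)
    return "".join(out)


print(fold_sketch("café"))
print(fold_sketch("Straße"))
```

A dedicated folding table, such as the one the Lucene filter is built around, covers many characters that decomposition alone misses, which is why the explicit table is needed even in this small sketch.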
Alternatives and similar repositories for fold_to_ascii
Users who are interested in fold_to_ascii are comparing it to the libraries listed below.
- Abydos NLP/IR library for Python ☆186 Updated 2 years ago
- A Cython implementation of the affine gap string distance ☆57 Updated 2 years ago
- Street address parser and formatter ☆91 Updated 5 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆151 Updated 4 months ago
- A trend viewer written in Python/JavaScript ☆21 Updated 6 months ago
- A simple fuzzy matching set for Python strings ☆227 Updated 9 months ago
- Python package for Google's diff-match-patch native C++ implementation. ☆77 Updated 11 months ago
- A Python implementation of Lunr.js 🌖 ☆196 Updated 3 months ago
- Language detection using spaCy and fastText ☆55 Updated last year
- Super-fast and clean conversions to numbers for Python. ☆109 Updated 3 months ago
- 💥 Cython hash tables that assume keys are pre-hashed ☆87 Updated last week
- Python search module for fast approximate string matching ☆54 Updated 2 years ago
- ISO 20275 ☆10 Updated last year
- Python binding for gumbo-parser using Cython ☆14 Updated 8 years ago
- Validation and data pipelines made easy! ☆12 Updated 5 years ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆66 Updated 2 years ago
- An asynchronous SPARQL client library using aiohttp ☆25 Updated 9 months ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- A comprehensive and scalable set of string tokenizers and similarity measures in Python ☆139 Updated 10 months ago
- Python Solr query utility // http://solrq.readthedocs.org/en/latest/ ☆25 Updated 2 years ago
- URL normalization for Python ☆95 Updated last month
- A Python library for working with and comparing language codes. ☆346 Updated last month
- Extract, parse and populate templates from strings ☆27 Updated 6 years ago
- Extract city and country mentions from text like GeoText, without regex but with FlashText, an Aho-Corasick implementation. ☆62 Updated this week
- gametight lightweight caching library for python ☆64 Updated 2 years ago
- Streaming newline delimited JSON I/O. ☆12 Updated last year
- Hunspell extension for spaCy 2.0. ☆94 Updated 10 months ago
- A natural language search microservice ☆95 Updated 4 years ago
- Language detection extension for spaCy 2.0+ ☆112 Updated 6 years ago
- DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library. ☆303 Updated 11 months ago