zheplusplus / pyunicodeblock
Python Unicode Block Utilities
☆24 Updated 4 years ago
Alternatives and similar repositories for pyunicodeblock:
Users who are interested in pyunicodeblock are comparing it to the libraries listed below.
- unihandecode is a transliteration library to convert all characters/words in Unicode into ASCII alphabet that aware with Language prefere… ☆69 Updated 2 years ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆69 Updated last month
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 Updated 3 years ago
- ISO 639 library for Python ☆32 Updated 6 months ago
- A Python binding of SQLite Full Text Search Tokenizer ☆47 Updated last month
- bin files ☆13 Updated last month
- Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s) ☆17 Updated 6 months ago
- 💥 Cython hash tables that assume keys are pre-hashed ☆86 Updated last month
- Python difflib with parts reimplemented in C ☆34 Updated last month
- Wikipedia API wrapper for humans and elk. (en.wikipedia.org/w/api.php, get it?) ☆36 Updated 10 years ago
- Lexical data at Unicode ☆67 Updated 6 months ago
- A pure Python Levenshtein implementation that's not freaking GPL'd. ☆96 Updated last year
- A python package for grapheme aware string handling ☆110 Updated 2 years ago
- Automatically exported from code.google.com/p/guess-language ☆53 Updated last year
- Multi-Language Identification ☆29 Updated 7 months ago
- Generic Environment for Context-Aware Correction of Orthography ☆22 Updated 2 years ago
- Python port of Boilerpipe library ☆15 Updated 6 years ago
- An LL parser for extracting information from Wiki text, particularly Wiktionary. ☆48 Updated last year
- Build a trie-structured regular expression from a list of words ☆21 Updated 5 years ago
- A sentence segmentation library with wide language support optimized for speed and utility. ☆58 Updated 6 months ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation. ☆21 Updated 7 years ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆65 Updated last year
- A Python library for working with and comparing language codes. ☆343 Updated 2 months ago
- A python package to simulate typographical errors. ☆31 Updated last year
- An efficient data structure for fast string similarity searches ☆22 Updated 4 years ago
- Hierarchical phrase-based machine translation system ☆32 Updated 10 years ago
- Fastest general-purpose parsing library for Python with a familiar API ☆44 Updated last month
- Stuttgart Finite State Transducer system ☆18 Updated 4 months ago
- This is an Object Oriented implementation of a Trie in python. The class contains setter and getter methods, and implements several usefu… ☆14 Updated 7 years ago
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆26 Updated 5 years ago