jacksonllee / iso639
ISO 639 language codes
☆50 · Updated last month
Alternatives and similar repositories for iso639
Users who are interested in iso639 are comparing it to the libraries listed below.
- A Python library for working with and comparing language codes. ☆353 · Updated 8 months ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆76 · Updated this week
- Accurately find/replace/remove emojis in text strings ☆163 · Updated 2 years ago
- Pythonic search engine based on PyLucene. ☆131 · Updated last week
- ☆176 · Updated 9 months ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆155 · Updated 2 years ago
- Hy-phen-ation made easy ☆218 · Updated this week
- A Python package to simulate typographical errors. ☆38 · Updated 2 years ago
- Check for multiple patterns in a single string at the same time: a fast Aho-Corasick algorithm for Python ☆218 · Updated 2 weeks ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆26 · Updated last month
- Python Finite-State Toolkit ☆60 · Updated last week
- Cython wrapper on Hunspell Dictionary ☆66 · Updated last year
- Confection: the sweetest config system for Python ☆192 · Updated 3 weeks ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆181 · Updated 7 months ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆112 · Updated 7 months ago
- A Python implementation of Lunr.js 🌖 ☆202 · Updated 9 months ago
- A fast, comprehensive, ISO 639 library. ☆47 · Updated 4 months ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆122 · Updated 2 months ago
- Tool to fix bitexts and tag near-duplicates for removal ☆34 · Updated 4 months ago
- Transform TMX to text ☆28 · Updated 3 years ago
- A Python module to reduce Unicode to a 'good enough' ASCII representation (outdated GitHub copy) ☆43 · Updated 14 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆74 · Updated 9 months ago
- Targeted language identifier, based on FastText and Hunspell. ☆38 · Updated 4 months ago
- Parse numbers written in natural language ☆124 · Updated last year
- Faster, modernized fork of the language identification tool langid.py ☆61 · Updated last year
- A Python package for grapheme-aware string handling ☆114 · Updated 3 years ago
- This is a simple Python package for calculating a variety of lexical diversity indices ☆82 · Updated 2 years ago
- A Python true casing utility that restores case information for texts ☆88 · Updated 3 years ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆68 · Updated 2 years ago
- A module to compute textual lexical richness (aka lexical diversity). ☆112 · Updated 2 years ago