jacksonllee / iso639
ISO 639 language codes
☆45Updated 4 months ago
Alternatives and similar repositories for iso639
Users that are interested in iso639 are comparing it to the libraries listed below
- fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified (zh-hans) and Traditional Chinese (zh-ha…☆39Updated 2 years ago
- Python Finite-State Toolkit☆56Updated 3 weeks ago
- Tool to fix bitexts and tag near-duplicates for removal☆30Updated 5 months ago
- A flexible sentence segmentation library using a CRF model and regex rules☆29Updated last year
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.☆107Updated last month
- Fast and accurate natural language detection. Detector written in Python. Nito-ELD, ELD.☆17Updated last year
- A Python library for working with and comparing language codes.☆345Updated 2 months ago
- Gamma Agreement in Python☆44Updated last year
- ☆170Updated 3 months ago
- Next-generation Punkt sentence boundary detection with zero dependencies☆17Updated 3 months ago
- A sentence segmentation library with wide language support optimized for speed and utility.☆65Updated 3 weeks ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆51Updated last week
- Rust Python bindings for symspell☆19Updated last year
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing☆73Updated 2 weeks ago
- Suite for phonetic word embeddings, especially their evaluation and baseline models.☆30Updated 4 months ago
- Multilingual syllable annotation pipeline component for spacy☆39Updated 2 years ago
- A tiny BERT for low-resource monolingual models☆31Updated 9 months ago
- A Python package to simulate typographical errors.☆35Updated last year
- MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki☆23Updated 2 weeks ago
- A Streamlit component for annotating text by text selecting.☆40Updated last year
- Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E…☆27Updated 2 years ago
- Faster, modernized fork of the language identification tool langid.py☆56Updated 7 months ago
- ☆74Updated 3 months ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format☆33Updated 6 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency☆166Updated last month
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆82Updated 10 months ago
- A Python truecasing utility that restores case information for texts☆89Updated 2 years ago
- Fast edit distance Python extension written in Cython/C++. Supports Levenshtein distance and Damerau Optimal String Alignment (OSA) dista…☆24Updated last month
- Accurately find/replace/remove emojis in text strings☆163Updated last year
- ISO 639 library for Python☆33Updated 10 months ago