jacksonllee / iso639
ISO 639 language codes
☆44 Updated 2 months ago
Alternatives and similar repositories for iso639:
Users who are interested in iso639 are comparing it to the libraries listed below.
- Tool to fix bitexts and tag near-duplicates for removal ☆30 Updated 3 months ago
- Gamma Agreement in Python ☆43 Updated last year
- Searching in-memory corpus with Corpus Query Language (CQL) ☆19 Updated 5 months ago
- Python Finite-State Toolkit ☆54 Updated 2 months ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆106 Updated last week
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆71 Updated last week
- Unicode Standard tokenization routines and orthography profile segmentation ☆37 Updated 2 months ago
- Curriculum training ☆17 Updated 2 months ago
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated last month
- ☆14 Updated 2 years ago
- A Python toolkit converting pronunciation in enwiktionary XML dump to cmudict format ☆33 Updated 5 years ago
- Execute arbitrary SQL queries on 🤗 Datasets ☆32 Updated last year
- A survey of corpora for Germanic low-resource languages and dialects ☆25 Updated 5 months ago
- A flexible sentence segmentation library using a CRF model and regex rules ☆29 Updated last year
- ☆22 Updated 3 years ago
- Phone inventory library ☆16 Updated last year
- Cython wrapper on Hunspell Dictionary ☆66 Updated 10 months ago
- MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki ☆23 Updated 2 months ago
- Python package for calculating famous measures in computational linguistics ☆13 Updated 6 months ago
- Rust-based Python wrapper for duckling library in Haskell ☆25 Updated 4 years ago
- A Python package to simulate typographical errors. ☆34 Updated last year
- A library for data streaming and augmentation ☆20 Updated this week
- Align the token outputs from spaCy and Hugging Face to help understand what language structures transformers see ☆44 Updated 2 years ago
- An experimental implementation of Burrows' Delta in Python 3 ☆21 Updated 3 years ago
- Python module for syllabifying English ARPABET transcriptions ☆66 Updated 6 years ago
- Featurize words into orthographic and phonological vectors. ☆40 Updated last year
- ☆72 Updated last month
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆51 Updated last week
- Suite for phonetic word embeddings, especially their evaluation and baseline models. ☆28 Updated 2 months ago
- Multilingual Open Text ☆25 Updated this week