jacksonllee / iso639
ISO 639 language codes
☆44 Updated 3 months ago
Alternatives and similar repositories for iso639
Users who are interested in iso639 are comparing it to the libraries listed below.
- Efficient trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacement ☆73 Updated last month
- fastlangid, the only language identification package that supports Cantonese (zh-yue), simplified (zh-hans) and traditional Chinese (zh-ha… ☆39 Updated 2 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆30 Updated 4 months ago
- Searching an in-memory corpus with Corpus Query Language (CQL) ☆19 Updated 6 months ago
- Phone inventory library ☆16 Updated 2 years ago
- Python Finite-State Toolkit ☆55 Updated this week
- A Python package to simulate typographical errors. ☆35 Updated last year
- Multilingual syllable annotation pipeline component for spaCy ☆39 Updated 2 years ago
- This is a prototype of a multilingual suite for named-entity recognition in Python. ☆21 Updated last year
- A Python module for word inflections designed for use with spaCy. ☆92 Updated 5 years ago
- A file utility for accessing both local and remote files through a unified interface. ☆42 Updated 3 weeks ago
- Fast and accurate natural language detection. Detector written in Python. Nito-ELD, ELD. ☆17 Updated last year
- A tiny BERT for low-resource monolingual models ☆31 Updated 8 months ago
- A Python toolkit for converting pronunciations in the enwiktionary XML dump to cmudict format ☆33 Updated 5 years ago
- ISO 639 library for Python ☆33 Updated 9 months ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆81 Updated 8 months ago
- Gamma Agreement in Python ☆44 Updated last year
- Rust Python bindings for symspell ☆19 Updated last year
- Library for fast text representation and classification. ☆28 Updated last year
- Code for SaGe subword tokenizer (EACL 2023) ☆25 Updated 6 months ago
- ☆22 Updated 3 years ago
- Unicode Standard tokenization routines and orthography profile segmentation ☆37 Updated 3 months ago
- A survey of corpora for Germanic low-resource languages and dialects ☆25 Updated 6 months ago
- Python package for calculating well-known measures in computational linguistics ☆14 Updated 7 months ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆107 Updated last week
- Universal syllabification algorithms ☆44 Updated 2 years ago
- Cython wrapper on Hunspell Dictionary ☆66 Updated 11 months ago
- SegEval Segmentation Evaluation Package ☆56 Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- The grobidmonkey package is an open-source package designed for postprocessing GROBID outputs. ☆11 Updated last year