jacksonllee / iso639
ISO 639 language codes
☆38 · Updated 2 months ago
Alternatives and similar repositories for iso639:
Users who are interested in iso639 are comparing it to the libraries listed below:
- Rust Python bindings for symspell ☆18 · Updated last year
- Tool to fix bitexts and tag near-duplicates for removal ☆29 · Updated 5 months ago
- Fast and accurate natural language detection. Detector written in Python. Nito-ELD, ELD. ☆15 · Updated last year
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆68 · Updated 2 weeks ago
- Rust-based Python wrapper for duckling library in Haskell ☆25 · Updated 4 years ago
- fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified (zh-hans) and Traditional Chinese (zh-ha… ☆39 · Updated 2 years ago
- A Python package to simulate typographical errors. ☆31 · Updated last year
- Source code for the Apple reproduction ☆31 · Updated 3 years ago
- This is a prototype of a multi-lingual suite for named-entity recognition in Python. ☆21 · Updated 8 months ago
- Python Finite-State Toolkit ☆47 · Updated last week
- A file utility for accessing both local and remote files through a unified interface. ☆35 · Updated this week
- Python tools for interacting with Wikidata ☆148 · Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions ☆52 · Updated last year
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated last week
- Lightweight utility tools for the detection of multiple spellings, meanings, and language-specific terminology in British and American En… ☆15 · Updated 3 years ago
- Resource and Tool for Writing System Identification -- LREC 2024 ☆13 · Updated 7 months ago
- Tower Parse: Low-Resource Dependency Parsing via Hierarchical Source Selection ☆15 · Updated 3 years ago
- A flexible sentence segmentation library using CRF model and regex rules ☆28 · Updated 10 months ago
- CMU Linguistic Annotation Backend ☆15 · Updated 9 months ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆65 · Updated last year
- A Python module for word inflections designed for use with spaCy. ☆92 · Updated 4 years ago
- Targeted language identifier, based on FastText and Hunspell. ☆33 · Updated 2 months ago
- Faster, modernized fork of the language identification tool langid.py ☆50 · Updated last month
- MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki ☆22 · Updated last month
- 🧬 A VS Code extension for annotating data with Prodigy ☆30 · Updated 3 years ago
- ISO 639 library for Python ☆32 · Updated 4 months ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆69 · Updated 8 months ago
- An experimental implementation of Burrows' Delta in Python 3 ☆20 · Updated 3 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆22 · Updated last month
- A Python module for retrieving script types of writing systems including alphabets, abjads, abugidas, syllabaries, logographs, featurals … ☆12 · Updated 5 months ago