tibetan-nlp / classical-tibetan-corpus
Linguistically analyzed Classical Tibetan texts
☆23 Updated 3 years ago
Related projects:
- 😎 Curated list of Tibetan NLP projects ☆33 Updated 4 years ago
- 🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python ☆58 Updated last week
- 🦜 NLP for Tibetan, in Python. ☆32 Updated last year
- Hunspell files for Tibetan ☆20 Updated 9 years ago
- ☆49 Updated 5 months ago
- ✒️ དག་བྱེད། Dakje, improving your spelling and readability ☆11 Updated 2 years ago
- repo for Tibetan corpora ☆21 Updated last year
- Lucene analyzer for Tibetan ☆12 Updated last week
- simple CSV database of Tibetan verbs ☆20 Updated 9 years ago
- 😎 Curated list of Tibetan canon datasets ☆14 Updated 4 years ago
- Efficient Low-Memory Aligner ☆135 Updated 2 weeks ago
- ☆61 Updated 4 months ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated last year
- Sentence aligner ☆106 Updated 3 years ago
- This is a collection of sentence-level aligned Sanskrit-Tibetan Etexts. ☆14 Updated 2 years ago
- ☆27 Updated 4 months ago
- Tibetan Language Processing Library ☆18 Updated 6 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆23 Updated last year
- ☆18 Updated 7 years ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆27 Updated 3 months ago
- Tibetan to English Machine Translation ☆10 Updated 3 years ago
- OpusFilter - Parallel corpus processing toolkit ☆101 Updated last month
- Multilingual sentence alignment using sentence embeddings ☆92 Updated 9 months ago
- Efficient Markov Chain word alignment ☆54 Updated 3 years ago
- ☆42 Updated 6 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆27 Updated 3 years ago
- SegBo: A database of borrowed sounds in the world’s languages ☆15 Updated 6 months ago
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆21 Updated 2 years ago
- Translation Memory Open-source Purifier ☆32 Updated last year
- An initiative to collect and distribute resources for co-reference resolution in a unified standard. ☆23 Updated 4 months ago