Esukhia / bophono
Tibetan phonetics engine in Python
☆17 Updated 6 months ago
Alternatives and similar repositories for bophono
Users who are interested in bophono are comparing it to the libraries listed below.
- 🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python ☆67 Updated 2 months ago
- Resources for spell checking Tibetan ☆13 Updated 4 years ago
- Tibetan Unicode to Wylie converter. (EWTS-Extended Wylie Transliteration Scheme) ☆25 Updated this week
- repo for Tibetan corpora ☆21 Updated 2 years ago
- Linguistically analyzed Classical Tibetan texts ☆26 Updated 3 years ago
- all of Tibetan dictionary. Not to be used for commercial purposes; violations will lead to legal action. ☆14 Updated last year
- Lucene analyzer for Tibetan ☆12 Updated this week
- 😎 Curated list of Tibetan NLP projects ☆37 Updated 4 years ago
- ✒️ དག་བྱེད། Dakje, improving your spelling and readability ☆11 Updated 2 years ago
- 😎 Curated list of Tibetan canon datasets ☆17 Updated 5 years ago
- Collation algorithm for Tibetan ☆10 Updated 9 years ago
- ☆56 Updated 4 months ago
- simple CSV database of Tibetan verbs ☆22 Updated 9 years ago
- Python interface to ISLEX, an English IPA pronunciation dictionary with syllable and stress marking. ☆52 Updated last year
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 Updated 3 years ago
- Multilingual sentence alignment using sentence embeddings ☆117 Updated 6 months ago
- Library for extracting text and timestamps from multiple subtitle files (.ass, .ssa, .srt, .sub, .txt). ☆52 Updated last year
- Aligned bilingual word vectors for English and Chinese ☆11 Updated 6 years ago
- British English pronunciation dictionary ☆95 Updated 7 years ago
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆13 Updated last year
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ☆36 Updated 4 years ago
- This converter converts multiple Uyghur scripts: ULS (Uyghur Latin Script), UAS (Uyghur Arabic Script), CTS (Common Turkic Script), UCS(Uyg… ☆48 Updated 7 months ago
- Public repository of open access Tibetan fonts ☆18 Updated last week
- An audio and transcribed corpus of contemporary Hong Kong Cantonese ☆37 Updated 4 years ago
- 📈 A forced aligner intended for synchronization of narrated text ☆93 Updated 2 years ago
- Convert epub file to txt ☆36 Updated last year
- Unicode Standard tokenization routines and orthography profile segmentation ☆37 Updated 2 months ago
- Urdu Word Segmentation using Conditional Random Fields (CRFs) ☆12 Updated 6 years ago
- An even smaller speech recognizer / force aligner ☆32 Updated 4 months ago
- Code for our paper in ACL 2017 ☆13 Updated 7 years ago