tugstugi / mongolian-nlp
Useful resources for Mongolian NLP
☆184Updated 5 months ago
Alternatives and similar repositories for mongolian-nlp
Users who are interested in mongolian-nlp are comparing it to the libraries listed below
- Cyrillic Mongolian text classification with TensorFlow 2, and also some fine-tuning on TugsTugi's Mongolian BERT model and other NLP expe…☆32Updated 2 years ago
- Pre-trained Mongolian BERT models☆46Updated 4 years ago
- Mongolian speech recognition with PyTorch☆134Updated 4 years ago
- Generate a 1 million-sample warm-up dataset for neural machine translation from a 700 million-word Mongolian text corpus using the Google…☆18Updated 3 months ago
- The Mongolian Wordnet (MonWN)☆17Updated 3 years ago
- Монгол үгийн алдаа шалгах толь, Mongolian spellchecking dictionary☆83Updated last month
- Pytorch-Named-Entity-Recognition-with-BERT☆15Updated 4 years ago
- SOTA punctuation restoration (e.g. for automatic speech recognition) deep learning model based on a pre-trained BERT model☆180Updated 6 years ago
- Text to Speech with PyTorch (English and Mongolian)☆185Updated 7 months ago
- 🙊 software for creating speech recognition models.☆159Updated 11 months ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation☆26Updated 2 years ago
- repo for Tibetan corpora☆21Updated 2 years ago
- 😎 Curated list of Tibetan NLP projects☆37Updated 4 years ago
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te…☆277Updated 3 months ago
- ALBERT trained on Mongolian text corpus☆18Updated 4 years ago
- Universal Romanizer that can convert any unicode script to roman (latin) script☆197Updated 9 months ago
- Sentence aligner☆112Updated 3 years ago
- cLang-8 is a dataset for grammatical error correction.☆104Updated 2 years ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks.☆75Updated last year
- 🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python☆67Updated 2 months ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation)☆45Updated 2 years ago
- An audio and transcribed corpus of contemporary Hong Kong Cantonese☆37Updated 4 years ago
- Data and code for grapheme-to-phoneme transducers in lots of languages☆136Updated last year
- 🦜 NLP for Tibetan, in Python.☆35Updated last year
- Improved Sentence Alignment in Linear Time and Space☆171Updated 2 years ago
- Support tools for punctuation and boundary detection for ASR output.☆57Updated 2 years ago
- Punctuation Restoration using Transformer Models for High- and Low-Resource Languages☆214Updated 9 months ago
- Python package and data files for manipulating phonological segments (phones, phonemes) in terms of universal phonological features.☆257Updated 9 months ago
- Lecture and seminar materials for Deep Learning summer school in Ulaanbaatar, 2019☆12Updated 3 years ago
- ☆58Updated last year