sharavsambuu / english-mongolian-nmt-dataset-augmentation
Generate a 1 million-sample warm-up dataset for neural machine translation from a 700 million-word Mongolian text corpus using the Google Translate service
☆18Updated 3 months ago
Alternatives and similar repositories for english-mongolian-nmt-dataset-augmentation
Users who are interested in english-mongolian-nmt-dataset-augmentation are comparing it to the repositories listed below
- Cyrillic Mongolian text classification with TensorFlow 2, plus some fine-tuning on TugsTugi's Mongolian BERT model and other NLP expe…☆32Updated 2 years ago
- Pre-trained Mongolian BERT models☆46Updated 4 years ago
- Useful resources for Mongolian NLP☆184Updated 5 months ago
- Mongolian speech recognition with PyTorch☆134Updated 4 years ago
- The Mongolian Wordnet (MonWN)☆17Updated 3 years ago
- Монгол үгийн алдаа шалгах толь, Mongolian spellchecking dictionary☆83Updated last month
- Pytorch-Named-Entity-Recognition-with-BERT☆15Updated 4 years ago
- Mongolian automated license plate recognition.☆13Updated 4 years ago
- Lecture and seminar materials for Deep Learning summer school in Ulaanbaatar, 2019☆12Updated 3 years ago
- Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/…☆60Updated 3 years ago
- Text and Punctuation correction with Deep Learning☆128Updated 5 years ago
- ☆13Updated 6 years ago
- Use Rasa to build a FAQ bot☆75Updated 2 years ago
- ALBERT trained on Mongolian text corpus☆18Updated 4 years ago
- Text to Speech with PyTorch (English and Mongolian)☆185Updated 7 months ago
- Use Language Model (LM) for Grammar Error Correction (GEC), without the use of annotated data.☆83Updated 5 years ago
- Wikipedia article dataset☆12Updated 6 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation☆26Updated 2 years ago
- The English-Vietnamese Bilingual Corpus (EVBCorpus) is a collection of English and Vietnamese parallel translations and bitexts.☆42Updated 5 years ago
- BERT question-answering system intended for, and working well on, only short texts such as 1 to 2 paragraph summaries. It can’t be ab…☆113Updated 4 years ago
- Speech recognition framework using Keras☆14Updated 7 years ago
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end.☆71Updated 2 years ago
- Fast and accurate spell correction library☆81Updated 3 years ago
- ☆64Updated last year
- A Python based API to access Indian language WordNets.☆39Updated 3 years ago
- Indian Language Tagger and Chunker (Hindi, Telugu, Tamil, Marathi, Punjabi, Kannada, Malayalam, Urdu, Bengali)☆42Updated 2 years ago
- Sentence aligner☆112Updated 3 years ago
- Code for extracting parallel corpora from pmindia☆16Updated 5 years ago
- A web application that interfaces two GEC systems. [web instance is down]☆31Updated 9 months ago
- Collection of Urdu datasets for POS, NER, Sentiment, Summarization and NLP tasks.☆72Updated 9 months ago