sharavsambuu / english-mongolian-nmt-dataset-augmentation
Generate a 1 million-sample warm-up dataset for neural machine translation from a 700 million-word Mongolian text corpus using the Google Translate service
☆18 Updated last month
Alternatives and similar repositories for english-mongolian-nmt-dataset-augmentation:
Users who are interested in english-mongolian-nmt-dataset-augmentation are comparing it to the repositories listed below
- Cyrillic Mongolian text classification with TensorFlow 2, and also some fine-tuning on TugsTugi's Mongolian BERT model and other NLP expe… ☆32 Updated 2 years ago
- Pre-trained Mongolian BERT models ☆46 Updated 4 years ago
- Useful resources for Mongolian NLP ☆181 Updated 3 months ago
- Mongolian speech recognition with PyTorch ☆134 Updated 4 years ago
- The Mongolian Wordnet (MonWN) ☆17 Updated 3 years ago
- Монгол үгийн алдаа шалгах толь, Mongolian spellchecking dictionary ☆84 Updated 2 weeks ago
- Pytorch-Named-Entity-Recognition-with-BERT ☆15 Updated 4 years ago
- Lecture and seminar materials for Deep Learning summer school in Ulaanbaatar, 2019 ☆12 Updated 3 years ago
- Text to Speech with PyTorch (English and Mongolian) ☆185 Updated 6 months ago
- Lecture and seminar materials for Deep Learning summer school in Ulaanbaatar, 2021 ☆10 Updated 3 years ago
- Jupyter notebooks that use the Fastai library ☆92 Updated 3 years ago
- Use Rasa to build a FAQ bot ☆75 Updated 2 years ago
- Neural chitchat components for Rasa ☆16 Updated 2 years ago
- A Keras implementation of a deep learning network to simultaneously perform Word Segmentation and Part-of-Speech (POS) Tagging introduced… ☆11 Updated 2 years ago
- ALBERT trained on Mongolian text corpus ☆18 Updated 4 years ago
- Resources and tools for Bangla language computation ☆14 Updated last year
- cLang-8 is a dataset for grammatical error correction. ☆103 Updated 2 years ago
- Question Generation using Google T5 and Text2Text ☆154 Updated 4 years ago
- PyTorch implementation for paper 'BANNER: A Cost-Sensitive Contextualized Model for Bangla Named Entity Recognition' ☆14 Updated 4 years ago
- ☆12 Updated 9 years ago
- Automatic Context Sensitive Spelling Correction for Bangla Text Using BERT and Levenshtein Distance ☆20 Updated 4 months ago
- ☆43 Updated 2 years ago
- State of the Art Language models and Classifier for Bengali, which is primarily spoken by the Bengalis in South Asia. ☆32 Updated 4 years ago
- Document processing using transformers ☆20 Updated last year
- SOTA punctuation restoration (e.g. for automatic speech recognition) deep learning model based on a pre-trained BERT model ☆180 Updated 5 years ago
- Collection of Urdu datasets for POS, NER, Sentiment, Summarization and NLP tasks. ☆73 Updated 7 months ago
- ☆109 Updated last year
- The English-Vietnamese Bilingual Corpus (EVBCorpus) is a collection of English and Vietnamese parallel translations and bitexts. ☆42 Updated 5 years ago
- This repository is used to publish our code for the conference paper "Vietnamese punctuation prediction using deep neural networks" at S… ☆10 Updated 4 years ago
- My Notes on TensorFlow Dev Summit 2020 ☆13 Updated 5 years ago