sharavsambuu / english-mongolian-nmt-dataset-augmentation
Generate a 1-million-sample warm-up dataset for neural machine translation from a 700-million-word Mongolian text corpus using the Google Translate service
☆18 Updated 2 months ago
Alternatives and similar repositories for english-mongolian-nmt-dataset-augmentation:
Users interested in english-mongolian-nmt-dataset-augmentation are comparing it to the repositories listed below.
- Cyrillic Mongolian text classification with tensorflow 2, and also some fine-tuning on TugsTugi's Mongolian BERT model and other NLP expe… ☆32 Updated 2 years ago
- Pre-trained Mongolian BERT models ☆46 Updated 4 years ago
- Useful resources for Mongolian NLP ☆184 Updated 4 months ago
- Mongolian speech recognition with PyTorch ☆134 Updated 4 years ago
- The Mongolian Wordnet (MonWN) ☆17 Updated 3 years ago
- Монгол үгийн алдаа шалгах толь, Mongolian spellchecking dictionary ☆84 Updated 2 weeks ago
- Mongolian automated license plate recognition. ☆13 Updated 4 years ago
- Text to Speech with PyTorch (English and Mongolian) ☆185 Updated 7 months ago
- Datasets and tools for basic natural language processing. ☆380 Updated 3 years ago
- Pytorch-Named-Entity-Recognition-with-BERT ☆15 Updated 4 years ago
- Lecture and seminar materials for Deep Learning summer school in Ulaanbaatar, 2019 ☆12 Updated 3 years ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models" ☆36 Updated last year
- Newspaper Segmentation into images and text ☆12 Updated 6 years ago
- Automatically Score essays using Deep Learning ☆149 Updated 5 years ago
- A simple web interface for building voice assistants with Rasa ☆175 Updated last year
- 4th place solution to Zindi's low-resource automatic speech recognition competition ☆8 Updated 3 years ago
- Virtual Assistant ☆75 Updated 2 years ago
- Collection of Urdu datasets for POS, NER, Sentiment, Summarization and NLP tasks. ☆73 Updated 8 months ago
- Document processing using transformers ☆20 Updated 2 years ago
- Arabic edition of BERT pretrained language models ☆129 Updated 4 years ago
- Crowd sourced training data for Rasa NLU models ☆201 Updated last year
- A simple POS tagger built with a bidirectional LSTM in Keras, trained on the Brown Corpus ☆34 Updated 6 years ago
- Word segmentation using Conditional Random Fields (CRF) for Khmer documents ☆29 Updated 4 years ago
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2 ☆125 Updated last year
- A fast and accurate POS and morphological tagging toolkit (EACL 2014) ☆141 Updated 5 years ago
- Multilingual Speech Recognition for Indonesian Languages ☆62 Updated 2 years ago
- Named-entity recognition (NER) (also known as entity identification, entity chunking and entity extraction) is a subtask of information e… ☆29 Updated 4 years ago