sharavsambuu / mongolian-text-classification
Cyrillic Mongolian text classification with TensorFlow 2, plus fine-tuning of TugsTugi's Mongolian BERT model and other NLP experiments.
☆32Updated 2 years ago
Alternatives and similar repositories for mongolian-text-classification
Users who are interested in mongolian-text-classification are comparing it to the repositories listed below.
- Generate a 1 million-sample warm-up dataset for neural machine translation from a 700 million-word Mongolian text corpus using the Google…☆18Updated 4 months ago
- Useful resources for Mongolian NLP☆184Updated 5 months ago
- Pre-trained Mongolian BERT models☆46Updated 4 years ago
- Mongolian speech recognition with PyTorch☆134Updated 4 years ago
- Pytorch-Named-Entity-Recognition-with-BERT☆15Updated 4 years ago
- The Mongolian Wordnet (MonWN)☆17Updated 3 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation☆26Updated 2 years ago
- Arabic Dialect Identification on AOC data.☆24Updated 6 years ago
- Монгол үгийн алдаа шалгах толь, Mongolian spellchecking dictionary☆85Updated 2 months ago
- Arabic edition of BERT pretrained language models☆129Updated 4 years ago
- ArSarcasm-v2 is an extension to the original ArSarcasm dataset. It was used for the shared task on sarcasm detection and sentiment analys…☆11Updated 3 years ago
- The first Dialectal Arabic Code Switching - DACS corpus from broadcast speech. Annotated at the token-level, considering both the linguis…☆14Updated 3 years ago
- This repository contains the Arabic sarcasm dataset (ArSarcasm)☆24Updated 4 years ago
- Machine Translation Web Interface for OpenNMT-py☆25Updated 3 years ago
- A small Python script that transliterates Arabic text using the Buckwalter Transliteration Scheme. It allows for multiple decisions to be…☆26Updated 11 years ago
- cLang-8 is a dataset for grammatical error correction.☆106Updated 2 years ago
- OpusFilter - Parallel corpus processing toolkit☆104Updated 2 months ago
- ☆120Updated 4 years ago
- Benchmark Arabic text diacritization dataset☆75Updated 5 years ago
- Improved Sentence Alignment in Linear Time and Space☆173Updated 2 years ago
- Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.☆41Updated last year
- NTREX -- News Test References for MT Evaluation☆83Updated last year
- Arabic support for textblob☆85Updated 3 years ago
- Sentence aligner☆113Updated 4 years ago
- Python transliteration library (mostly from non-Latin scripts, such as Arabic, Japanese, etc.)☆20Updated 6 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa.☆39Updated 2 years ago
- ☆23Updated 4 years ago
- A Python implementation of Farasa toolkit☆129Updated 8 months ago
- ☆40Updated last month
- Arabic speech recognition and dialect identification (Red Hen Lab - GSoC 2018)☆17Updated 4 years ago