asmelashteka / HornMT
Machine translation (MT) benchmark dataset for languages in the Horn of Africa.
☆39 Updated 2 years ago
Alternatives and similar repositories for HornMT:
Users who are interested in HornMT are comparing it to the repositories listed below.
- AfroLID, a powerful neural toolkit for African language identification which covers 517 African languages. ☆30 Updated last year
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ☆98 Updated 8 months ago
- NTREX -- News Test References for MT Evaluation ☆80 Updated 7 months ago
- These are lists for a variety of languages containing words that are distinctive to each language. ☆35 Updated 2 years ago
- ☆106 Updated last year
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ☆72 Updated last year
- Crosslingual Question Answering for African Languages ☆29 Updated 3 months ago
- ☆44 Updated 2 years ago
- Python Finite-State Toolkit ☆47 Updated last week
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆37 Updated last year
- A repository for the 2022 Inflection Shared Task ☆9 Updated 2 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆24 Updated last year
- ☆14 Updated 2 years ago
- Creating super-parallel corpora of more than 1500 unique languages for NLP research ☆32 Updated 2 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆29 Updated 5 months ago
- A tiny BERT for low-resource monolingual models ☆31 Updated 3 months ago
- ☆42 Updated 3 years ago
- Code for extracting parallel corpora from pmindia ☆16 Updated 4 years ago
- Transform TMX to text ☆29 Updated 2 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and their 102 different scripts (orthographies) ☆27 Updated 3 years ago
- ☆19 Updated 3 years ago
- Scripts to create speech corpora from open.bible ☆12 Updated 3 years ago
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars ☆15 Updated 7 months ago
- Open information and community for machine translation ☆72 Updated last month
- SIGTYP 2022 Shared Task ☆9 Updated 2 years ago
- This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources; Literature Books, Radio/… ☆34 Updated 3 weeks ago
- Curriculum training ☆16 Updated this week
- Natural language understanding benchmarks for Norwegian ☆14 Updated last year
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of a sentiment corpus for four Nigerian languages: Igbo, H… ☆31 Updated last year
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/ ☆47 Updated last year