SunbirdAI / salt-data-archive
Multi-way parallel text corpus of 5 key Ugandan languages.
☆17Updated last year
Alternatives and similar repositories for salt-data-archive
Users who are interested in salt-data-archive are comparing it to the libraries listed below.
- Machine Translation for Africa☆297Updated 3 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.☆112Updated last year
- All our community docs! Start here! Let's put Africa on the NLP Map☆62Updated last year
- Masakhane Web is a translation web application solely for African languages.☆37Updated 2 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa.☆40Updated 3 years ago
- A large scale Sanskrit-English translation dataset☆73Updated 2 years ago
- AfroLID, a powerful neural toolkit for African language identification which covers 517 African languages.☆32Updated 7 months ago
- The largest public catalogue for Arabic NLP and speech datasets. There are 500+ datasets annotated with more than 25 attributes.☆184Updated last week
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages☆77Updated 3 years ago
- DziriBERT: a Pre-trained Language Model for the Algerian Dialect☆166Updated 2 years ago
- Arabic edition of BERT pretrained language models☆132Updated 4 years ago
- A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.☆498Updated 2 weeks ago
- ☆114Updated 3 weeks ago
- English to Twi translation system being put together by the GhanaNLP team☆35Updated 4 months ago
- Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.☆540Updated last month
- Arabic support for textblob☆86Updated 4 years ago
- ☆21Updated 3 years ago
- A simple semi-supervised approach for creating huggingface data script loaders and upload to the hub.☆11Updated last year
- Yorùbá language training text for NLP, ASR and TTS tasks☆81Updated 2 years ago
- A Python implementation of Farasa toolkit☆136Updated last month
- Pre-trained Nordic models for BERT☆174Updated 3 years ago
- pyarabic☆470Updated last year
- UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic☆111Updated 4 years ago
- Arabic NLP tool used to perform text search, POS tagging, translation, auto-diacritization, etc.☆90Updated 4 years ago
- Arabic Tokenization Library. It provides many tokenization algorithms.☆109Updated last year
- Facebook Low Resource (FLoRes) MT Benchmark☆754Updated last year
- The Dakshina dataset is a collection of text in both Latin and native scripts for 12 South Asian languages. For each language, the datase…☆201Updated 5 years ago
- BRAD: Books Reviews in Arabic Dataset☆15Updated 7 years ago
- Pre-processing and training scripts for the Tarteel Dataset☆212Updated 3 years ago
- A collection of paper implementations using the PyTorch framework☆29Updated 4 years ago