ijdutse / hausa-corpus
A collection of textual datasets in the Hausa language with corresponding English translations.
☆15Updated 4 years ago
Alternatives and similar repositories for hausa-corpus:
Users who are interested in hausa-corpus are comparing it to the repositories listed below:
- Crosslingual Question Answering for African Languages☆29Updated 6 months ago
- MasakhaNEWS: News Topic Classification for African Languages☆23Updated 11 months ago
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of sentiment corpus for four Nigerian languages: Igbo, H…☆32Updated last year
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages☆73Updated 2 years ago
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/☆48Updated last year
- Building an effective preprocessing tool for African languages☆12Updated last year
- Shoonya - Platform to Annotate and label data at scale.☆54Updated 7 months ago
- Almost state-of-the-art text generation library☆66Updated 5 months ago
- ☆110Updated last year
- This repo contains 3 hours of audio speech recordings in Yoruba language collected for research purposes.☆16Updated 4 years ago
- COMET for African languages☆10Updated 3 months ago
- Masakhane Web is a translation web application solely for African languages.☆36Updated last year
- Bilingual sentence similarity classifier using Tensorflow☆21Updated 5 years ago
- Tool to take your ML model from local to production with one line of code.☆25Updated last year
- MAFAND-MT☆55Updated 9 months ago
- Fast model deployment on AWS Lambda☆14Updated last year
- All my experiments with the various transformers and various transformer frameworks available☆14Updated 3 years ago
- Predicting what word comes next with Tensorflow.☆10Updated 2 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆22Updated 2 years ago
- A tiny BERT for low-resource monolingual models☆31Updated 6 months ago
- A python package to augment text data using NLP.☆40Updated 2 months ago
- A parallel evaluation data set of SAP software documentation with document structure annotation☆11Updated last month
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.☆39Updated last year
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️☆36Updated 2 years ago
- Experiments for XLM-V Transformers Integration☆13Updated 2 years ago
- Pre-trained, multilingual sequence-to-sequence models for Indian languages☆46Updated 2 years ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼☆22Updated 3 months ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa.☆39Updated 2 years ago
- Scripts to convert datasets from various sources to Hugging Face Datasets.☆57Updated 2 years ago
- A Streamlit app to add structured tags to a dataset card☆22Updated 2 years ago