ijdutse / hausa-corpus
A collection of textual datasets in the Hausa language with corresponding English translations.
☆15 Updated 4 years ago
Alternatives and similar repositories for hausa-corpus
Users who are interested in hausa-corpus are comparing it to the repositories listed below.
- Crosslingual Question Answering for African Languages ☆31 Updated 11 months ago
- MAFAND-MT ☆57 Updated last year
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages ☆75 Updated 3 years ago
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of sentiment corpus for four Nigerian languages: Igbo, H… ☆32 Updated last year
- Building an effective preprocessing tool for African languages ☆13 Updated last year
- Shoonya - Platform to Annotate and label data at scale. ☆57 Updated 11 months ago
- COMET for African languages ☆10 Updated 7 months ago
- GenieNLP: A versatile codebase for any NLP task ☆89 Updated last year
- Domain-Specific Text Generation for Machine Translation (with LLMs) - scripts and config files for the paper ☆17 Updated 2 years ago
- ☆57 Updated 3 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ☆109 Updated last year
- Masakhane Web is a translation web application for solely African Languages. ☆37 Updated 2 years ago
- Reproducing "Writing with Transformer" demo, using aitextgen/FastAPI in backend, Quill/React in frontend ☆28 Updated 4 years ago
- Almost state of art text generation library ☆66 Updated last month
- Consists of the largest (10K) human annotated code-switched semantic parsing dataset & 170K generated utterance using the CST5 augmentati… ☆41 Updated 2 years ago
- An example of multilingual machine translation using a pretrained version of mt5 from Hugging Face. ☆42 Updated 4 years ago
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆35 Updated 2 years ago
- Documentation effort for the BookCorpus dataset ☆34 Updated 4 years ago
- Hugging Face and Pyserini interoperability ☆20 Updated 2 years ago
- CodeSwitch is a NLP tool, can use for language identification, pos tagging, name entity recognition, sentiment analysis of code mixed dat… ☆35 Updated 4 years ago
- ☆110 Updated last year
- A tiny BERT for low-resource monolingual models ☆31 Updated 11 months ago
- Translation demonstrator ☆34 Updated 5 years ago
- MasakhaNEWS: News Topic Classification for African Languages ☆24 Updated last year
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages : https://afrisenti-semeval.github.io/ ☆48 Updated last year
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection. ☆41 Updated 2 years ago
- Generate a SQLite database from Wikipedia & Wikidata dumps. ☆34 Updated last year
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training ☆43 Updated 4 years ago
- Rasa's retail starter pack ☆42 Updated 2 years ago
- Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/… ☆60 Updated 4 years ago