ijdutse / hausa-corpus
A collection of textual datasets in the Hausa language with their corresponding English translations.
☆15Updated 4 years ago
Alternatives and similar repositories for hausa-corpus
Users who are interested in hausa-corpus are comparing it to the repositories listed below.
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of sentiment corpus for four Nigerian languages: Igbo, H…☆32Updated last year
- Crosslingual Question Answering for African Languages☆30Updated 9 months ago
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/☆47Updated last year
- Building an effective preprocessing tool for African languages☆13Updated last year
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages☆74Updated 3 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.☆106Updated last year
- Common crawl pretrained sentencepiece tokenizers for English and Japanese for various vocabulary sizes. Also development environment for …☆10Updated 3 years ago
- Almost state-of-the-art text generation library☆66Updated last week
- A tiny BERT for low-resource monolingual models☆31Updated 9 months ago
- Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/…☆60Updated 3 years ago
- Tool to take your ML model from local to production with one line of code.☆25Updated last year
- Scripts for working with open.bible data☆24Updated 3 years ago
- Masakhane Web is a translation web application solely for African languages.☆37Updated last year
- COMET for African languages☆10Updated 5 months ago
- ☆109Updated last year
- Meme search engine built with Jina neural search framework. Search with captions or image files to find matching memes.☆24Updated 3 years ago
- Using short models to classify long texts☆21Updated 2 years ago
- Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text …☆13Updated last year
- Audio Preprocessing and finetuning of wav2vec2-large-xlsr model on AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF Data.☆17Updated 3 years ago
- Predicting what word comes next with Tensorflow.☆10Updated 2 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa.☆39Updated 2 years ago
- NLG Best Practices for Data-Efficient Modeling: How to Train Production-Ready Models with Little Data☆10Updated 3 years ago
- Shoonya - Platform to Annotate and label data at scale.☆54Updated 9 months ago
- 🫠 Check your data before you wreck your model☆16Updated 2 years ago
- Pre-trained, multilingual sequence-to-sequence models for Indian languages☆48Updated 2 years ago
- TorchServe+Streamlit for easily serving your HuggingFace NER models☆33Updated 2 years ago
- MasakhaNEWS: News Topic Classification for African Languages☆23Updated last year
- This will hold the data pipeline that converts raw audio data to speech, which will act as the input dataset for the speech-to-text pipeline☆32Updated 2 years ago
- ☆20Updated 4 years ago
- A Streamlit application to visualize sentence embeddings☆19Updated 2 years ago