gauthelo/kallaama-speech-dataset

A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund.

☆20

Alternatives and similar repositories for kallaama-speech-dataset

Users who are interested in kallaama-speech-dataset are comparing it to the repositories listed below.

WolofProcessing / online_wolof_data
Curated online Wolof text resources that can be used to build models
☆28 · Jun 25, 2026 · Updated last month
uds-lsv / afro-maft
☆17 · Jan 12, 2023 · Updated 3 years ago
masakhane-io / afriqa
Crosslingual Question Answering for African Languages
☆31 · Sep 27, 2024 · Updated last year
alpoktem / bible2speechDB
Scripts to create speech corpora from open.bible
☆13 · Jan 3, 2022 · Updated 4 years ago
neulab / AfricanVoices
Hosts text-to-speech corpus and speech synthesizers for African languages.
☆19 · May 31, 2023 · Updated 3 years ago
DerXter / NumMenu-Bot
An example of a chatbot with a number-based menu that can be used as a starting point for a project.
☆29 · Apr 24, 2024 · Updated 2 years ago
masakhane-io / lafand-mt
MAFAND-MT
☆63 · Jul 9, 2024 · Updated 2 years ago
milkymap / pdf2gpt-index
Build a gpt-index using ChatGPT and sentence-transformers
☆14 · Apr 8, 2023 · Updated 3 years ago
Waxal-Multilingual / speech-data
This repository contains multi-modal speech data for African languages that can be used to train ASR and NLP models
☆19 · Aug 31, 2022 · Updated 3 years ago
ylacombe / scripts_and_notebooks
A list of scripts/notebooks I'd like to keep handy
☆18 · Aug 15, 2024 · Updated last year
McGill-NLP / AfroBench
Large Scale Benchmark of Large Language Models on African Languages
☆21 · Jul 28, 2025 · Updated last year
Niger-Volta-LTI / yoruba-voice
Repo & Project for the Imminent Research Grant code & tasks
☆12 · May 20, 2024 · Updated 2 years ago
dadelani / sib-200
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
☆26 · May 20, 2026 · Updated 2 months ago
Mister-iks / ai_suggest_deployment
AI SUGGEST is a powerful command-line assistant that leverages AI to provide accurate Linux commands based on natural language queries. S…
☆11 · Aug 22, 2024 · Updated last year
ehsanasgari / 1000Langs
Creating super-parallel corpora of more than 1500+ unique languages for NLP research
☆33 · Dec 8, 2022 · Updated 3 years ago
csikasote / BembaSpeech
This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources; Literature Books, Radio/…
☆41 · Jul 31, 2025 · Updated 11 months ago
alirezamshi / small100
Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E…
☆30 · Feb 8, 2023 · Updated 3 years ago
getalp / ALFFA_PUBLIC
☆53 · Dec 3, 2021 · Updated 4 years ago
masakhane-io / africomet
COMET for African languages
☆11 · Jan 24, 2025 · Updated last year
ARBML / dar
A simple semi-supervised approach for creating Hugging Face dataset loading scripts and uploading them to the hub.
☆11 · Jun 23, 2024 · Updated 2 years ago
MicrosoftTranslator / NTREX
NTREX -- News Test References for MT Evaluation
☆87 · Jun 5, 2024 · Updated 2 years ago
oya163 / nepali-ner
Named Entity Recognition in Nepali Language
☆10 · Jan 12, 2023 · Updated 3 years ago
abdouaziz / wolof
Wolof is a library for specific NLP tasks in the Wolof language, e.g. text classification in Wolof, NMT, ASR
☆32 · Nov 28, 2023 · Updated 2 years ago
ekapolc / gowajee_corpus
Thai smart home corpus with "Gowajee" hotword
☆19 · Jul 30, 2023 · Updated 2 years ago
dsridhar91 / hstm
Code and data for "Heterogeneous Supervised Topic Models"
☆10 · Jun 27, 2022 · Updated 4 years ago
csong27 / auditing-text-generation
Code for Auditing Data Provenance in Text-Generation Models (in KDD 2019)
☆10 · Jun 18, 2019 · Updated 7 years ago
multitel-ai / urban-sound-tagging
1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context
☆17 · Dec 8, 2022 · Updated 3 years ago
Patil-Onkar / Remove-silence-from-an-audio
☆10 · Jun 30, 2022 · Updated 4 years ago
TylorShine / MNP-SVC
Real-time end-to-end singing voice conversion
☆25 · Nov 3, 2024 · Updated last year
AsoSoft / AsoSoft-TTS-Speech-Corpus-for-Central-Kurdish
AsoSoft Speech Corpus for Central-Kurdish Text-To-Speech
☆23 · Jun 24, 2022 · Updated 4 years ago
stefan-it / xlm-v-experiments
Experiments for XLM-V Transformers Integration
☆13 · Feb 8, 2023 · Updated 3 years ago
R1ckShi / FrontEnd-AEC
Acoustic echo cancellation (AEC) is a main algorithm in the pipeline of acoustic devices with KWS or ASR. FNLMS is used.
☆19 · Apr 22, 2019 · Updated 7 years ago
haneul-yoo / HUE
Hanja Understanding Evaluation Dataset
☆15 · May 2, 2022 · Updated 4 years ago
kb-labb / kb_bart
Pretraining scripts for BART transformer model
☆12 · May 15, 2023 · Updated 3 years ago
luoqiaoyang / ACL2021-LaSAML
The repo for ACL2021 findings paper - Don't Miss the Labels: Label-semantic Augmented Meta-Learner for Few-Shot Text Classification
☆15 · Mar 24, 2022 · Updated 4 years ago
mrvoh / meta_learning_multilingual_doc_classification
Placeholder repository
☆15 · Mar 16, 2022 · Updated 4 years ago
dadelani / africanlp-resources
List of all the resources I developed in collaboration with LSV and Masakhane during my doctoral studies and beyond
☆13 · Aug 15, 2022 · Updated 3 years ago
google-research / url-nlp
☆273 · Aug 1, 2025 · Updated 11 months ago
upskyy / Paper-Review
Paper Review about Speech Recognition · NLP
☆10 · Mar 25, 2021 · Updated 5 years ago