csikasote/BembaSpeech

This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources: literature books, radio/TV show transcripts, YouTube video transcripts, and online sources. The corpus has 14,438 utterances totaling over 24 hours of speech.

☆41

Alternatives and similar repositories for BembaSpeech

Users that are interested in BembaSpeech are comparing it to the libraries listed below.

unza-speech-lab / zambezi-voice
Repository for multilingual speech data resources for native languages of Zambia.
☆22 · Oct 9, 2024 · Updated last year
alpoktem / bible2speechDB
Scripts to create speech corpora from open.bible
☆13 · Jan 3, 2022 · Updated 4 years ago
Andrews2017 / africanlp-public-datasets
A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.
☆117 · Apr 26, 2024 · Updated 2 years ago
dadelani / sib-200
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
☆26 · May 20, 2026 · Updated 2 months ago
desh2608 / pytorch-tdnn
PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training
☆41 · Dec 18, 2020 · Updated 5 years ago
gauthelo / kallaama-speech-dataset
A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund.
☆20 · Mar 26, 2026 · Updated 4 months ago
wonjune-kang / expressive-speech-retrieval
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15 · Aug 18, 2025 · Updated 11 months ago
ellisk42 / bpl_phonology
☆16 · May 24, 2022 · Updated 4 years ago
uniglot / korean-word-ipa-dictionary
Dictionary of Korean word and IPA pairs crawled from Wiktionary (Korean edition)
☆23 · Nov 12, 2025 · Updated 8 months ago
connormayer / maxent.ot
An R package for implementing and evaluating Maximum Entropy Optimality Theory models
☆10 · Updated this week
connormayer / phonological_software
A repository containing links to useful phonological software
☆12 · Feb 16, 2023 · Updated 3 years ago
ehsanasgari / 1000Langs
Creating super-parallel corpora of 1,500+ unique languages for NLP research
☆33 · Dec 8, 2022 · Updated 3 years ago
masakhane-io / masakhane-news
MasakhaNEWS: News Topic Classification for African Languages
☆26 · May 12, 2024 · Updated 2 years ago
WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11 · May 14, 2025 · Updated last year
tjmahr / readtextgrid
Read in a 'Praat' 'TextGrid' File
☆17 · Oct 28, 2025 · Updated 9 months ago
pacscilab / voxangeles
VoxAngeles Corpus
☆15 · Aug 23, 2025 · Updated 11 months ago
neulab / AfricanVoices
Hosts text-to-speech corpus and speech synthesizers for African languages.
☆19 · May 31, 2023 · Updated 3 years ago
WolofProcessing / online_wolof_data
Curate online Wolof text resources that can be used to build models
☆28 · Jun 25, 2026 · Updated last month
WingZLeung / TTDS
Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database.
☆13 · Mar 15, 2025 · Updated last year
coqui-ai / open-bible-scripts
Scripts for working with open.bible data
☆26 · Jan 24, 2022 · Updated 4 years ago
xinjli / phonepiece
phone inventory library
☆17 · May 15, 2023 · Updated 3 years ago
Niger-Volta-LTI / yoruba-voice
Repo & Project for the Imminent Research Grant code & tasks
☆12 · May 20, 2024 · Updated 2 years ago
LearnNLP / nlp_arxiv_daily
arxiv daily for speech translation, legal. Ref: Vincentqyw/cv-arxiv-daily
☆15 · Jan 6, 2025 · Updated last year
masakhane-io / lafand-mt
MAFAND-MT
☆63 · Jul 9, 2024 · Updated 2 years ago
facebookresearch / emphassess
This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses …
☆25 · Jan 9, 2024 · Updated 2 years ago
Andrews2017 / KINNEWS-and-KIRNEWS-Corpus
Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text …
☆15 · Apr 26, 2024 · Updated 2 years ago
Davisy / Detect-and-Translate-Text-Data
How to detect language and translate text data into the language of your choice when working on an NLP project
☆11 · Jan 13, 2021 · Updated 5 years ago
wavlab-speech / cmu_multilingual_speech
CMU multilingual speech repository
☆30 · Apr 15, 2022 · Updated 4 years ago
carted / handling-variable-length-text-tf
This repository shows how to efficiently process variable-length sequences in TensorFlow.
☆14 · Apr 26, 2022 · Updated 4 years ago
mohamedScikitLearn / Information-retrieval--Text-mining-
A full walkthrough of how to create a search engine using Python. Text mining, TF-IDF, textual data manipulation, Boolean model,…
☆14 · Dec 19, 2018 · Updated 7 years ago
isheunesutembo / TB-Computer-Aided-Diagnosis-Using-Deep-Learning
☆11 · Nov 20, 2019 · Updated 6 years ago
kevindegila / flask-joey
A Simple Flask App to interact with your Machine Translation Model
☆13 · Feb 26, 2020 · Updated 6 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
rbhatia46 / Data-Preprocessing-Template
This repository includes all the Data Preprocessing required before using a dataset on a Machine Learning Model. Please refer README on h…
☆13 · May 29, 2018 · Updated 8 years ago
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆16 · Apr 14, 2026 · Updated 3 months ago
JackEdTaylor / LexOPS
An R Package and Shiny App for generating matched stimuli for factorial-design experiments.
☆29 · Jan 16, 2025 · Updated last year
DerXter / State-of-NLP-Research-in-Senegal
First comprehensive survey of NLP work carried out in Senegalese languages covering various tasks + Applications in the social sciences.
☆30 · Updated this week
dadelani / menyo-20k_MT
☆11 · Jul 12, 2021 · Updated 5 years ago
asmelashteka / HornMT
Machine translation (MT) benchmark dataset for languages in the Horn of Africa.
☆46 · Oct 13, 2022 · Updated 3 years ago