This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources: literature books, radio/TV show transcripts, YouTube video transcripts, and online sources. The corpus has 14,438 utterances, totaling over 24 hours of speech.
☆39 · Jul 31, 2025 · Updated 9 months ago
Alternatives and similar repositories for BembaSpeech
Users that are interested in BembaSpeech are comparing it to the libraries listed below.
- Repository for multilingual speech data resources for native languages of Zambia. ☆21 · Oct 9, 2024 · Updated last year
- Scripts to create speech corpora from open.bible ☆13 · Jan 3, 2022 · Updated 4 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ☆114 · Apr 26, 2024 · Updated 2 years ago
- Pypi installable TDNN and TDNN-F layers for PyTorch based acoustic model training ☆41 · Dec 18, 2020 · Updated 5 years ago
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects ☆24 · May 20, 2026 · Updated last week
- A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund. ☆19 · Mar 26, 2026 · Updated 2 months ago
- Dictionary of pairs of Korean word and IPA crawled from Wiktionary (Korean edition) ☆23 · Nov 12, 2025 · Updated 6 months ago
- ☆15 · May 24, 2022 · Updated 4 years ago
- An R package for implementing and evaluating Maximum Entropy Optimality Theory models ☆10 · May 14, 2026 · Updated 2 weeks ago
- Creating super-parallel corpora of 1500+ unique languages for NLP research ☆34 · Dec 8, 2022 · Updated 3 years ago
- A repository containing links to useful phonological software ☆12 · Feb 16, 2023 · Updated 3 years ago
- MasakhaNEWS: News Topic Classification for African Languages ☆26 · May 12, 2024 · Updated 2 years ago
- arxiv daily for speech translation, legal. Ref: Vincentqyw/cv-arxiv-daily ☆15 · Jan 6, 2025 · Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆11 · May 14, 2025 · Updated last year
- Curate online Wolof text resources that can be used to build models ☆28 · May 11, 2026 · Updated 2 weeks ago
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database. ☆12 · Mar 15, 2025 · Updated last year
- phone inventory library ☆17 · May 15, 2023 · Updated 3 years ago
- Scripts for working with open.bible data ☆26 · Jan 24, 2022 · Updated 4 years ago
- This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses … ☆25 · Jan 9, 2024 · Updated 2 years ago
- MAFAND-MT ☆62 · Jul 9, 2024 · Updated last year
- Repo & Project for the Imminent Research Grant code & tasks ☆12 · May 20, 2024 · Updated 2 years ago
- Whisper Speech Quality Assessment (WhiSQA) ☆16 · Apr 14, 2026 · Updated last month
- Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text … ☆15 · Apr 26, 2024 · Updated 2 years ago
- How to detect language and translate text data into the language of your choice when working on an NLP project ☆11 · Jan 13, 2021 · Updated 5 years ago
- CMU multilingual speech repository ☆30 · Apr 15, 2022 · Updated 4 years ago
- Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language ☆43 · Feb 28, 2018 · Updated 8 years ago
- Bunachar Náisiúnta Moirfeolaíochta | Irish National Morphology Database ☆27 · Jun 10, 2024 · Updated last year
- This repo contains the baseline model recipes and pre-trained model for the GramVanni Hindi ASR challenge ☆16 · Mar 26, 2022 · Updated 4 years ago
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation ☆31 · Sep 6, 2024 · Updated last year
- ☆12 · Mar 7, 2022 · Updated 4 years ago
- ☆11 · Jul 12, 2021 · Updated 4 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa. ☆45 · Oct 13, 2022 · Updated 3 years ago
- A Simple Flask App to interact with your Machine Translation Model ☆13 · Feb 26, 2020 · Updated 6 years ago
- A streamlit app that creates a web demo of the project: https://github.com/bryandlee/animegan2-pytorch ☆12 · Apr 6, 2022 · Updated 4 years ago
- SyPhon: Constraint-based Learning of Phonological Rules ☆11 · Mar 5, 2025 · Updated last year
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github… ☆15 · Dec 19, 2022 · Updated 3 years ago
- Introduction to the Random Forest algorithm for classification problems and how to select important features in your dataset. ☆13 · Aug 1, 2020 · Updated 5 years ago
- Grapheme to phoneme converter for Estonian ☆14 · May 27, 2021 · Updated 5 years ago
- asr2k ☆52 · Jun 2, 2024 · Updated last year