This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources: literature books, radio/TV show transcripts, YouTube video transcripts, and online sources. The corpus has 14,438 utterances totaling over 24 hours of speech.
☆40Jul 31, 2025Updated 10 months ago
Alternatives and similar repositories for BembaSpeech
Users that are interested in BembaSpeech are comparing it to the libraries listed below.
- Repository for multilingual speech data resources for native languages of Zambia.☆22Oct 9, 2024Updated last year
- Scripts to create speech corpora from open.bible☆13Jan 3, 2022Updated 4 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.☆114Apr 26, 2024Updated 2 years ago
- Pypi installable TDNN and TDNN-F layers for PyTorch based acoustic model training☆41Dec 18, 2020Updated 5 years ago
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects☆24May 20, 2026Updated 3 weeks ago
- A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund.☆19Mar 26, 2026Updated 2 months ago
- Dictionary of pairs of Korean word and IPA crawled from Wiktionary (Korean edition)☆23Nov 12, 2025Updated 7 months ago
- ☆15May 24, 2022Updated 4 years ago
- An R package for implementing and evaluating Maximum Entropy Optimality Theory models☆10May 28, 2026Updated 3 weeks ago
- Creating super-parallel corpora of more than 1500+ unique languages for NLP research☆34Dec 8, 2022Updated 3 years ago
- A repository containing links to useful phonological software☆12Feb 16, 2023Updated 3 years ago
- arxiv daily for speech translation, legal. Ref: Vincentqyw/cv-arxiv-daily☆15Jan 6, 2025Updated last year
- MasakhaNEWS: News Topic Classification for African Languages☆26May 12, 2024Updated 2 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated last year
- Read in a 'Praat' 'TextGrid' File☆17Oct 28, 2025Updated 7 months ago
- Hosts text-to-speech corpus and speech synthesizers for African languages.☆18May 31, 2023Updated 3 years ago
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database.☆13Mar 15, 2025Updated last year
- phone inventory library☆17May 15, 2023Updated 3 years ago
- scripts for working with open.bible data☆26Jan 24, 2022Updated 4 years ago
- This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses …☆25Jan 9, 2024Updated 2 years ago
- MAFAND-MT☆62Jul 9, 2024Updated last year
- Repo & Project for the Imminent Research Grant code & tasks☆12May 20, 2024Updated 2 years ago
- Whisper Speech Quality Assessment (WhiSQA)☆16Apr 14, 2026Updated 2 months ago
- Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text …☆15Apr 26, 2024Updated 2 years ago
- How to detect language and translate text data into the language of your choice when working on an NLP project☆11Jan 13, 2021Updated 5 years ago
- CMU multilingual speech repository☆30Apr 15, 2022Updated 4 years ago
- Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language☆43Feb 28, 2018Updated 8 years ago
- Bunachar Náisiúnta Moirfeolaíochta | Irish National Morphology Database☆27Jun 10, 2024Updated 2 years ago
- This repo contains the baseline model recipes and pre-trained model for GramVanni hindi ASR challenge☆16Mar 26, 2022Updated 4 years ago
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation☆31Sep 6, 2024Updated last year
- ☆12Mar 7, 2022Updated 4 years ago
- ☆11Jul 12, 2021Updated 4 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa.☆45Oct 13, 2022Updated 3 years ago
- A Simple Flask App to interact with your Machine Translation Model☆13Feb 26, 2020Updated 6 years ago
- A streamlit app that creates a web demo of the project: https://github.com/bryandlee/animegan2-pytorch☆12Apr 6, 2022Updated 4 years ago
- SyPhon: Constraint-based Learning of Phonological Rules☆11Mar 5, 2025Updated last year
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github…☆15Dec 19, 2022Updated 3 years ago
- Introduction to the Random Forest algorithm for classification problems and how to select important features in your dataset.☆13Aug 1, 2020Updated 5 years ago
- Hausa-NMT: Empirical Study of Neural Machine translation for English-Hausa-English☆17Oct 20, 2020Updated 5 years ago