This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources: literature books, radio/TV show transcripts, YouTube video transcripts, and online sources. The corpus has 14,438 utterances, amounting to over 24 hours of speech.
☆39 · Jul 31, 2025 · Updated 9 months ago
Alternatives and similar repositories for BembaSpeech
Users that are interested in BembaSpeech are comparing it to the libraries listed below.
- Repository for multilingual speech data resources for native languages of Zambia. ☆20 · Oct 9, 2024 · Updated last year
- Scripts to create speech corpora from open.bible ☆13 · Jan 3, 2022 · Updated 4 years ago
- Pypi installable TDNN and TDNN-F layers for PyTorch based acoustic model training ☆41 · Dec 18, 2020 · Updated 5 years ago
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects ☆23 · Jan 26, 2025 · Updated last year
- A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund. ☆19 · Mar 26, 2026 · Updated last month
- Dictionary of pairs of Korean word and IPA crawled from Wiktionary (Korean edition) ☆23 · Nov 12, 2025 · Updated 5 months ago
- ☆15 · May 24, 2022 · Updated 3 years ago
- An R package for implementing and evaluating Maximum Entropy Optimality Theory models ☆10 · Apr 30, 2026 · Updated last week
- Creating super-parallel corpora of more than 1500+ unique languages for NLP research ☆34 · Dec 8, 2022 · Updated 3 years ago
- A repository containing links to useful phonological software ☆12 · Feb 16, 2023 · Updated 3 years ago
- MasakhaNEWS: News Topic Classification for African Languages ☆26 · May 12, 2024 · Updated last year
- arxiv daily for speech translation, legal. Ref: Vincentqyw/cv-arxiv-daily ☆15 · Jan 6, 2025 · Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆11 · May 14, 2025 · Updated 11 months ago
- Read in a 'Praat' 'TextGrid' File ☆17 · Oct 28, 2025 · Updated 6 months ago
- Curate online Wolof text resources that can be used to build models ☆28 · Mar 7, 2026 · Updated 2 months ago
- Scripts for working with open.bible data ☆26 · Jan 24, 2022 · Updated 4 years ago
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database. ☆12 · Mar 15, 2025 · Updated last year
- phone inventory library ☆17 · May 15, 2023 · Updated 2 years ago
- MAFAND-MT ☆62 · Jul 9, 2024 · Updated last year
- This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses … ☆25 · Jan 9, 2024 · Updated 2 years ago
- An HTML+CSS+JS game to learn the Arabic alphabet. ☆10 · Mar 4, 2023 · Updated 3 years ago
- Repo & Project for the Imminent Research Grant code & tasks ☆12 · May 20, 2024 · Updated last year
- Whisper Speech Quality Assessment (WhiSQA) ☆16 · Apr 14, 2026 · Updated 3 weeks ago
- Working towards a free acoustic model for the automatic recognition of New Zealand English ☆19 · Aug 17, 2012 · Updated 13 years ago
- How to detect language and translate text data into the language of your choice when working on an NLP project ☆11 · Jan 13, 2021 · Updated 5 years ago
- This repository shows how to efficiently process variable-length sequences in TensorFlow. ☆14 · Apr 26, 2022 · Updated 4 years ago
- CMU multilingual speech repository ☆30 · Apr 15, 2022 · Updated 4 years ago
- Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language ☆43 · Feb 28, 2018 · Updated 8 years ago
- Bunachar Náisiúnta Moirfeolaíochta | Irish National Morphology Database ☆27 · Jun 10, 2024 · Updated last year
- This repository includes all the Data Preprocessing required before using a dataset on a Machine Learning Model. Please refer README on h… ☆13 · May 29, 2018 · Updated 7 years ago
- This repo contains the baseline model recipes and pre-trained model for the GramVanni Hindi ASR challenge ☆15 · Mar 26, 2022 · Updated 4 years ago
- ☆12 · Mar 7, 2022 · Updated 4 years ago
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation ☆31 · Sep 6, 2024 · Updated last year
- ☆11 · Jul 12, 2021 · Updated 4 years ago
- ☆11 · Nov 20, 2019 · Updated 6 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa. ☆45 · Oct 13, 2022 · Updated 3 years ago
- SyPhon: Constraint-based Learning of Phonological Rules ☆11 · Mar 5, 2025 · Updated last year
- A streamlit app that creates a web demo of the project: https://github.com/bryandlee/animegan2-pytorch ☆12 · Apr 6, 2022 · Updated 4 years ago
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github… ☆15 · Dec 19, 2022 · Updated 3 years ago