This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources: literature books, radio/TV show transcripts, YouTube video transcripts, and online sources. The corpus has 14,438 utterances, amounting to over 24 hours of speech.
☆38 · Jul 31, 2025 · Updated 7 months ago
Alternatives and similar repositories for BembaSpeech
Users that are interested in BembaSpeech are comparing it to the libraries listed below.
- Repository for multilingual speech data resources for native languages of Zambia. ☆20 · Oct 9, 2024 · Updated last year
- Scripts to create speech corpora from open.bible ☆13 · Jan 3, 2022 · Updated 4 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ☆114 · Apr 26, 2024 · Updated last year
- PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training ☆41 · Dec 18, 2020 · Updated 5 years ago
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects ☆23 · Jan 26, 2025 · Updated last year
- A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund. ☆18 · Apr 29, 2024 · Updated last year
- ☆14 · May 24, 2022 · Updated 3 years ago
- Dictionary of pairs of Korean word and IPA crawled from Wiktionary (Korean edition) ☆23 · Nov 12, 2025 · Updated 4 months ago
- An R package for implementing and evaluating Maximum Entropy Optimality Theory models ☆10 · Feb 24, 2026 · Updated last month
- Creating super-parallel corpora of more than 1500 unique languages for NLP research ☆34 · Dec 8, 2022 · Updated 3 years ago
- A repository containing links to useful phonological software ☆12 · Feb 16, 2023 · Updated 3 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆11 · May 14, 2025 · Updated 10 months ago
- MasakhaNEWS: News Topic Classification for African Languages ☆26 · May 12, 2024 · Updated last year
- arXiv daily for speech translation, legal. Ref: Vincentqyw/cv-arxiv-daily ☆15 · Jan 6, 2025 · Updated last year
- Read in a 'Praat' 'TextGrid' File ☆17 · Oct 28, 2025 · Updated 5 months ago
- Curate online Wolof text resources that can be used to build models ☆28 · Mar 7, 2026 · Updated 3 weeks ago
- Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database. ☆12 · Mar 15, 2025 · Updated last year
- phone inventory library ☆17 · May 15, 2023 · Updated 2 years ago
- This repository presents an evaluation framework for speech-to-speech (S2S) models, following the methodology described in the EmphAsses … ☆25 · Jan 9, 2024 · Updated 2 years ago
- MAFAND-MT ☆61 · Jul 9, 2024 · Updated last year
- An HTML+CSS+JS game to learn the Arabic alphabet. ☆10 · Mar 4, 2023 · Updated 3 years ago
- Repo & Project for the Imminent Research Grant code & tasks ☆12 · May 20, 2024 · Updated last year
- Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text … ☆15 · Apr 26, 2024 · Updated last year
- How to detect language and translate text data into the language of your choice when working on an NLP project ☆11 · Jan 13, 2021 · Updated 5 years ago
- This repository shows how to efficiently process variable-length sequences in TensorFlow. ☆14 · Apr 26, 2022 · Updated 3 years ago
- CMU multilingual speech repository ☆30 · Apr 15, 2022 · Updated 3 years ago
- Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language ☆43 · Feb 28, 2018 · Updated 8 years ago
- Bunachar Náisiúnta Moirfeolaíochta | Irish National Morphology Database ☆27 · Jun 10, 2024 · Updated last year
- A corpus of diacritized Hebrew texts (טקסט מנוקד) ☆11 · May 4, 2022 · Updated 3 years ago
- A set of skills to call your agent Bayes. Thomas Bayes. ☆45 · Mar 18, 2026 · Updated last week
- This repository includes all the Data Preprocessing required before using a dataset on a Machine Learning Model. Please refer README on h… ☆13 · May 29, 2018 · Updated 7 years ago
- This repo contains the baseline model recipes and pre-trained model for GramVanni Hindi ASR challenge ☆15 · Mar 26, 2022 · Updated 4 years ago
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation ☆31 · Sep 6, 2024 · Updated last year
- ☆11 · Jul 12, 2021 · Updated 4 years ago
- ☆11 · Nov 20, 2019 · Updated 6 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa. ☆42 · Oct 13, 2022 · Updated 3 years ago
- SyPhon: Constraint-based Learning of Phonological Rules ☆11 · Mar 5, 2025 · Updated last year
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github… ☆14 · Dec 19, 2022 · Updated 3 years ago
- Introduction to the Random Forest algorithm for classification problems and how to select important features in your dataset. ☆12 · Aug 1, 2020 · Updated 5 years ago