alpoktem / bible2speechDB
Scripts to create speech corpora from open.bible
☆11Updated 2 years ago
Related projects:
- scripts for working with open.bible data☆23Updated 2 years ago
- phone inventory library☆14Updated last year
- Hosts text-to-speech corpus and speech synthesizers for African languages.☆12Updated last year
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages☆13Updated last year
- MaSS - Multilingual corpus of Sentence-aligned Spoken utterances☆48Updated this week
- SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio…☆34Updated 3 years ago
- Enable RNNLM lattice rescoring with Pytorch [kaldi]☆12Updated 4 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…☆41Updated 3 years ago
- This is an ASR corpus for the Bemba language. It contains read speech from diverse publicly available Bemba sources: Literature Books, Radio/…☆31Updated 3 months ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler☆24Updated 3 years ago
- A collection of utilities for handling IPA phones.☆22Updated 11 months ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit.☆8Updated 2 years ago
- 🎯 Speech Recognition Challenge by Speech Lab - IIT Madras☆11Updated 3 years ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model.☆12Updated last year
- A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund.☆10Updated 4 months ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Updated 7 months ago
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW…☆15Updated last year
- Implementation of the DIVA model of speech acquisition and production using PyTorch☆20Updated last year
- 🫠 check your data, before you wreck your model☆16Updated 2 years ago
- Official implementation of the cross-lingual phoneme recognition architectures from "Allophant: Cross-lingual Phoneme Recognition with Ar…☆9Updated last year
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data.☆10Updated 3 years ago
- Unsupervised spoken sentence embeddings☆14Updated last year
- Word Error Rate Estimation☆10Updated 4 years ago
- docker for HF wav2vec2-sprint☆12Updated 3 years ago
- An adaptation of Fairseq to (End-to-end) speech translation.☆22Updated 2 years ago
- A guide to building language technology in new languages.☆57Updated 2 years ago