Command line tool to create corpora for Common Voice
☆78Feb 16, 2026Updated 2 weeks ago
Alternatives and similar repositories for CorporaCreator
Users that are interested in CorporaCreator are comparing it to the libraries listed below
Sorting:
- Script for bundling Common Voice (https://commonvoice.mozilla.org/) clips by language☆11Apr 13, 2023Updated 2 years ago
- Linguistic processing for Common Voice☆58Jan 18, 2024Updated 2 years ago
- ☆13Nov 16, 2022Updated 3 years ago
- steps to perform text-based speaker diarization with kaldi toolkit☆12Nov 2, 2018Updated 7 years ago
- Tool for creation, manipulation and maintenance of voice corpora☆82May 3, 2024Updated last year
- Tool to collect and review sentences for Common Voice☆82May 10, 2023Updated 2 years ago
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core☆15Jun 19, 2023Updated 2 years ago
- DeepSpeech based forced alignment tool☆239Dec 12, 2020Updated 5 years ago
- Filtering and Noise Adding Tool☆29May 27, 2022Updated 3 years ago
- A voice driven 3D chess game for learning Voice AI☆17Jul 6, 2022Updated 3 years ago
- Properly handle position-dependent phones in a subword lexicon FST☆31Oct 26, 2020Updated 5 years ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper.☆33Oct 23, 2025Updated 4 months ago
- A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone …☆43Aug 3, 2022Updated 3 years ago
- Unsupervised speech activity detection system.☆11Jul 2, 2018Updated 7 years ago
- This repository contains publicly available speech and text data in Luganda.☆12Sep 4, 2020Updated 5 years ago
- A python implementation of the neural network joint language model and an extension of it using global source context.☆11May 17, 2017Updated 8 years ago
- Tensorflow and kaldi implementation of our paper "VAE-based regularization for deep speaker embedding"☆11Mar 24, 2023Updated 2 years ago
- Similarity Learning applied to Speaker Verification and Semantic Textual Similarity☆12Apr 8, 2020Updated 5 years ago
- A simple pyaudio microphone interface☆11Jul 27, 2018Updated 7 years ago
- Scraping Wikipedia for fair use sentences☆54Jan 25, 2024Updated 2 years ago
- Denoising autoencoders for speaker identification on MCE 2018 challenge☆12Nov 8, 2018Updated 7 years ago
- Mozilla's Community Support Software☆19Mar 29, 2021Updated 4 years ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆11May 14, 2025Updated 9 months ago
- A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …☆16Sep 5, 2017Updated 8 years ago
- CS224S Course Project☆14Jun 9, 2014Updated 11 years ago
- Coqui STT (🐸STT) based forced alignment tool☆13Feb 24, 2022Updated 4 years ago
- Support tools for punctuation and boundary detection for ASR output.☆55Dec 8, 2022Updated 3 years ago
- Code for paper "Using Phonetic Posteriorgram Based Frame Pairing for Segmental Accent Conversion"☆36Jan 15, 2020Updated 6 years ago
- A living document for all things Common Voice.☆14Jun 24, 2024Updated last year
- Agent toolkit for 100 hours of speech and 10 GiB of text☆14Jul 15, 2025Updated 7 months ago
- ☆14Oct 11, 2024Updated last year
- ☆15Oct 11, 2019Updated 6 years ago
- Demo page of our paper Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks With Guided Attention, ICASSP 201…☆15May 30, 2021Updated 4 years ago
- SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio…☆36Apr 25, 2025Updated 10 months ago
- Coqui Inference Engine☆40Aug 3, 2021Updated 4 years ago
- python script for voice activity detection.☆36Aug 16, 2024Updated last year
- Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS☆40Aug 4, 2023Updated 2 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests☆14Jan 24, 2017Updated 9 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…☆17Mar 6, 2023Updated 3 years ago