common-voice / cv-dataset
Metadata and versioning details for the Common Voice dataset
☆146 · Updated last month
Alternatives and similar repositories for cv-dataset:
Users who are interested in cv-dataset are comparing it to the repositories listed below.
- Linguistic processing for Common Voice ☆55 · Updated last year
- Various speech datasets made available to the public ☆116 · Updated 4 months ago
- PyTorch implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech ☆230 · Updated 2 years ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ☆415 · Updated last month
- A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation ☆531 · Updated 2 years ago
- Python server for communicating with Kaldi from the browser using WebRTC ☆69 · Updated last year
- Word alignments generated by the Montreal Forced Aligner for the Librispeech dataset ☆162 · Updated 6 years ago
- Data and code for grapheme-to-phoneme transducers in lots of languages ☆135 · Updated last year
- VCTK multi-speaker tacotron for ICASSP 2020 ☆266 · Updated 3 years ago
- Segment an audio file and obtain utterance alignments. (Python package) ☆335 · Updated 11 months ago
- Tools for Speech Enhancement integrated with Kaldi ☆412 · Updated last year
- Diarization scoring tools. ☆242 · Updated 2 years ago
- Variational Bayes HMM over x-vectors diarization ☆269 · Updated last year
- Predicts the level of noise and reverberation in your audio files ☆149 · Updated 11 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆100 · Updated 3 months ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆310 · Updated 5 months ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆102 · Updated 2 years ago
- ☆39 · Updated last year
- UniSpeech - Large Scale Self-Supervised Learning for Speech ☆459 · Updated last year
- Advanced data structures for handling temporal segments with attached labels. ☆113 · Updated 3 months ago
- DeepSpeech based forced alignment tool ☆237 · Updated 4 years ago
- Segment a given audio into utterances using a trained end-to-end ASR model. ☆73 · Updated 4 years ago
- Charsiu: A neural phonetic aligner. ☆299 · Updated 2 years ago
- The People’s Speech Dataset ☆103 · Updated last year
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech. ☆205 · Updated last year
- Multilingual G2P in 100 languages ☆322 · Updated last year
- PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling ☆191 · Updated 3 years ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆162 · Updated 2 weeks ago
- Dataset for lightly supervised training using the LibriVox audiobook recordings. https://librivox.org/ ☆496 · Updated last year
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv… ☆137 · Updated 4 months ago