salesforce / speech-datasets
Simplified recipes for preparing commonly used speech datasets, and a PyTorch-compatible Python data loader that can perform standard feature computations & data augmentations.
☆15 Updated last year
Alternatives and similar repositories for speech-datasets:
Users who are interested in speech-datasets are comparing it to the libraries listed below.
- Fast and differentiable hidden Markov model in C++ ☆17 Updated 2 years ago
- A JAX library for building lattice-based speech transducer models ☆45 Updated 3 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated last year
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data. ☆11 Updated 4 years ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper. ☆20 Updated last year
- Source code for INTERSPEECH2020 ☆11 Updated 4 years ago
- Prosodic Speech Segmentation with Transformers ☆25 Updated last year
- ☆22 Updated 3 years ago
- Proposed splits for the LREC Wikipron paper ☆14 Updated 4 years ago
- Temporary anonymous version ☆22 Updated last year
- A library of speech gadgets. ☆13 Updated 2 years ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆24 Updated 2 years ago
- ☆12 Updated 3 years ago
- Code for the paper: How Much Context Does My Attention-Based ASR System Need? ☆11 Updated last month
- BERT and LSTM baseline models of the ZeroSpeech Challenge 2021 ☆58 Updated 2 years ago
- ☆42 Updated 3 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated last year
- WarpRNNT loss ported in Numba CPU/CUDA for Pytorch ☆16 Updated 3 years ago
- ☆17 Updated last year
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆13 Updated 4 years ago
- Text to Speech Synthesis based on controllable latent representation ☆14 Updated 5 years ago
- A collection of utilities for handling IPA phones. ☆25 Updated last year
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation ☆12 Updated 8 months ago
- Speech in Flax/JAX ☆15 Updated 2 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 Updated 2 years ago
- The aim of this project is to make voice assistants more responsive towards whisper to some extent. ☆10 Updated 5 years ago
- Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation ☆39 Updated 4 years ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ☆12 Updated 2 years ago
- GPT for FACodec ☆13 Updated last year
- PyTorch implementation of simplified neural source filter model (s-nsf) ☆14 Updated 3 years ago