salesforce / speech-datasets
Simplified recipes for preparing commonly used speech datasets, and a PyTorch-compatible Python data loader that can perform standard feature computations & data augmentations.
☆15 Updated last year
Alternatives and similar repositories for speech-datasets
Users who are interested in speech-datasets are comparing it to the libraries listed below.
- Fast and differentiable hidden Markov model in C++ ☆17 Updated 2 years ago
- ☆56 Updated 2 years ago
- Source code for INTERSPEECH2020 ☆11 Updated 4 years ago
- ☆17 Updated 2 years ago
- logWMSE, an audio quality metric with support for digital silence target. Useful for evaluating audio source separation systems, even whe… ☆35 Updated 8 months ago
- ☆32 Updated 3 years ago
- Code for the paper: How Much Context Does My Attention-Based ASR System Need? ☆10 Updated 3 weeks ago
- ☆42 Updated 3 years ago
- A home for audio ML in JAX. Has common features, learnable frontends, pretrained supervised and self-supervised models. ☆68 Updated 2 years ago
- A JAX library for building lattice-based speech transducer models ☆45 Updated 5 months ago
- Unsupervised Speech Decomposition via Triple Information Bottleneck ☆14 Updated 5 years ago
- Efficient Speech Processing Toolkit for Automatic Speaker Recognition ☆17 Updated 2 years ago
- ☆22 Updated 3 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated 2 years ago
- Text to Speech Synthesis based on controllable latent representation ☆14 Updated 5 years ago
- A collection of utilities for handling IPA phones. ☆25 Updated last year
- Pytorch Implementation of WaveNODE ☆64 Updated 4 years ago
- An implementation of SPEECHSPLIT ☆15 Updated 4 years ago
- Speech in Flax/JAX ☆15 Updated 2 years ago
- Rich Prosody Diversity Modelling with Phone-level Mixture Density Network ☆45 Updated 3 years ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆24 Updated 2 years ago
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model ☆32 Updated last year
- Learning and controlling the source-filter representation of speech with a variational autoencoder ☆45 Updated 2 years ago
- BERT and LSTM baseline models of the ZeroSpeech Challenge 2021 ☆60 Updated 2 years ago
- Torch implementation of Whisper-guided DDPM based Voice Conversion ☆49 Updated 2 years ago
- Data processing tools for preparing speech and labels for training TTS voices ☆27 Updated 4 years ago
- UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation ☆74 Updated 3 years ago
- Tensorflow implementation of DiffWave: A Versatile Diffusion Model for Audio Synthesis ☆41 Updated 4 years ago
- Speech Parameter Estimation Using Differentiable Speech Synthesizer ☆43 Updated 2 years ago
- Audio samples accompanying publications related to DF-Conformer, a speech enhancement model. ☆29 Updated 2 weeks ago