salesforce / speech-datasets
Simplified recipes for preparing commonly used speech datasets, and a PyTorch-compatible Python data loader that can perform standard feature computations & data augmentations.
☆15 Updated last year
Alternatives and similar repositories for speech-datasets:
Users who are interested in speech-datasets are comparing it to the libraries listed below:
- A JAX library for building lattice-based speech transducer models ☆45 Updated 4 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated last year
- ☆22 Updated 3 years ago
- Prosodic Speech Segmentation with Transformers ☆25 Updated last year
- ☆17 Updated 2 years ago
- Source code for INTERSPEECH2020 ☆11 Updated 4 years ago
- 🏥 🎤 The largest clinical study in the world to collect voice data labeled with health information (N>6,000 participants, 48 utterances… ☆28 Updated last month
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆14 Updated 2 months ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆24 Updated 2 years ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper. ☆22 Updated last year
- Code for the paper: How Much Context Does My Attention-Based ASR System Need? ☆11 Updated this week
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated last year
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation ☆12 Updated 9 months ago
- Text to Speech Synthesis based on controllable latent representation ☆14 Updated 5 years ago
- ☆20 Updated 6 years ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation ☆12 Updated 7 months ago
- ☆56 Updated 2 years ago
- Implementation of the DIVA model of speech acquisition and production using PyTorch ☆21 Updated 2 years ago
- ☆19 Updated 2 years ago
- Fast and differentiable hidden Markov model in C++ ☆17 Updated 2 years ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. ☆22 Updated last month
- Implementation of Google's USM speech model in PyTorch ☆31 Updated last month
- A collection of utilities for handling IPA phones. ☆25 Updated last year
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 Updated 3 years ago
- Speech in Flax/JAX ☆15 Updated 2 years ago
- Temporary anonymous version ☆22 Updated last year
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 Updated 10 months ago
- Data processing tools for preparing speech and labels for training TTS voices ☆26 Updated 4 years ago
- The aim of this project is to make voice assistants more responsive to whispered speech. ☆10 Updated 5 years ago
- Finetuning VITS Efficiently ☆32 Updated last year