A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.
☆48 · Oct 15, 2021 · Updated 4 years ago
Alternatives and similar repositories for data_augmentation_for_asr
Users who are interested in data_augmentation_for_asr are comparing it to the libraries listed below.
- Improving Recording Device Generalization using Impulse Response Augmentation ☆20 · Apr 24, 2025 · Updated last year
- A lightweight audio codec based on a single quantizer ☆34 · Sep 4, 2025 · Updated 8 months ago
- Benchmarking different VAD models on AVA-Speech dataset ☆18 · May 21, 2023 · Updated 3 years ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 · Mar 24, 2023 · Updated 3 years ago
- Golang VAD (voice activity detection) library based on WebRTC ☆12 · Dec 13, 2021 · Updated 4 years ago
- semantic tokenizer for speech and music ☆21 · Jul 6, 2025 · Updated 10 months ago
- SoTA open-source TTS ☆23 · Jun 17, 2025 · Updated 11 months ago
- Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation" ☆26 · Mar 27, 2024 · Updated 2 years ago
- Implementation of Sheffield entry for Clarity enhancement challenge. ☆18 · Apr 19, 2022 · Updated 4 years ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech ☆20 · Feb 9, 2025 · Updated last year
- Personalized Item Exploration Processes for Recommendation ☆15 · Sep 19, 2019 · Updated 6 years ago
- English to French and Chinese to French .json dictionaries for Synthesizer V ☆15 · Feb 1, 2023 · Updated 3 years ago
- A library for speech data augmentation in time-domain ☆687 · Aug 30, 2021 · Updated 4 years ago
- 🗣️ Convert between phonetic alphabets ☆11 · Feb 7, 2022 · Updated 4 years ago
- The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023. ☆102 · May 24, 2023 · Updated 3 years ago
- ☆21 · Sep 24, 2018 · Updated 7 years ago
- Real-Time De-noising and De-reverbing with Tiny Recurrent UNet ☆56 · Jun 7, 2023 · Updated 2 years ago
- ☆15 · Aug 25, 2022 · Updated 3 years ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM ☆17 · Nov 7, 2024 · Updated last year
- Discogs-VI dataset and code ☆21 · Dec 13, 2024 · Updated last year
- Personalized AEC ☆19 · Nov 3, 2022 · Updated 3 years ago
- ☆17 · Apr 3, 2022 · Updated 4 years ago
- Reproduction of a paper "Small-footprint keyword spotting using deep neural networks" ☆12 · Mar 11, 2019 · Updated 7 years ago
- PyTorch implementation of Listen, Attend and Spell (LAS) speech recognition paper ☆12 · Mar 4, 2022 · Updated 4 years ago
- Unofficial TensorFlow/Keras implementation of Google AI VoiceFilter ☆16 · Mar 25, 2023 · Updated 3 years ago
- Neural network density models for speech separation. ☆20 · Nov 26, 2020 · Updated 5 years ago
- VoxLingua107 recipe for SpeechBrain ☆13 · Jul 3, 2021 · Updated 4 years ago
- A PyTorch implementation of the paper: SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification ☆34 · Jun 25, 2021 · Updated 4 years ago
- Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡ ☆11 · Jan 23, 2025 · Updated last year
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022) ☆12 · Mar 12, 2024 · Updated 2 years ago
- A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library. ☆15 · May 19, 2020 · Updated 6 years ago
- [InterSpeech'2023] "Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion" ☆13 · Mar 14, 2024 · Updated 2 years ago
- This repository contains the code for the paper "Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection fr… ☆12 · Dec 19, 2025 · Updated 5 months ago
- Code for the paper "SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness" (NeurIPS 2021) ☆21 · Sep 27, 2022 · Updated 3 years ago
- One command to build TLG.fst for WeNet. ☆30 · Oct 11, 2022 · Updated 3 years ago
- MANNER: Multi-view Attention Network for Noise ERasure (Speech enhancement in time-domain) ☆65 · Aug 29, 2022 · Updated 3 years ago
- ☆10 · Oct 25, 2019 · Updated 6 years ago
- Source code for AAAI 22 paper: Hybrid Neural Networks for On-Device Directional Hearing ☆19 · Apr 10, 2024 · Updated 2 years ago
- Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization" ☆159 · Mar 3, 2026 · Updated 2 months ago