A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.
☆48Oct 15, 2021Updated 4 years ago
Alternatives and similar repositories for data_augmentation_for_asr
Users that are interested in data_augmentation_for_asr are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Improving Recording Device Generalization using Impulse Response Augmentation☆20Apr 24, 2025Updated 11 months ago
- A lightweight audio codec based on a single quantizer☆34Sep 4, 2025Updated 7 months ago
- Benchmarking different VAD models on AVA-Speech dataset☆18May 21, 2023Updated 2 years ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…☆12Mar 24, 2023Updated 3 years ago
- golang vad (voice activity detection) library based on webrtc☆12Dec 13, 2021Updated 4 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- SoTA open-source TTS☆23Jun 17, 2025Updated 10 months ago
- Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"☆26Mar 27, 2024Updated 2 years ago
- Implementation of Sheffield entry for Clarity enhancement challenge.☆18Apr 19, 2022Updated 4 years ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech☆20Feb 9, 2025Updated last year
- Personalized Item Exploration Processes for Recommendation☆15Sep 19, 2019Updated 6 years ago
- English to French and Chinese to French .json dictionaries for Synthesizer V☆12Feb 1, 2023Updated 3 years ago
- A library for speech data augmentation in time-domain☆685Aug 30, 2021Updated 4 years ago
- 🗣️ Convert between phonetic alphabets☆11Feb 7, 2022Updated 4 years ago
- The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023.☆101May 24, 2023Updated 2 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- ☆21Sep 24, 2018Updated 7 years ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆17Nov 7, 2024Updated last year
- Discogs-VI dataset and code☆20Dec 13, 2024Updated last year
- Variable Bitrate Residual Vector Quantization for Audio Coding☆50May 1, 2025Updated 11 months ago
- [ICON 2020] TensorFlow Code for "End-to-End Automatic Speech Recognition System for Gujarati"☆13Jul 26, 2021Updated 4 years ago
- Personalized AEC☆19Nov 3, 2022Updated 3 years ago
- ☆23May 15, 2023Updated 2 years ago
- A time delay estimation method for event-based time-series data. Time delay estimation is also known as the correction of time offsets an…☆15Dec 3, 2025Updated 4 months ago
- ☆17Apr 3, 2022Updated 4 years ago
- Serverless GPU API endpoints on Runpod - Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Reproduction of a paper"Small-footprint keyword spotting using deep neural networks"☆12Mar 11, 2019Updated 7 years ago
- PyTorch implementation of Listen, Attend and Spell (LAS) speech recognition paper☆12Mar 4, 2022Updated 4 years ago
- Unofficial Tensorflow/Keras implementation of Google AI VoiceFilter☆16Mar 25, 2023Updated 3 years ago
- superfast text to speech in any voice☆62Feb 16, 2026Updated 2 months ago
- A Pytorch implementation of the paper : SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification☆34Jun 25, 2021Updated 4 years ago
- An unofficial implementation of Vector Quantization Voice Conversion (VQVC).☆29Apr 12, 2021Updated 5 years ago
- A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.☆15May 19, 2020Updated 5 years ago
- Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡☆11Jan 23, 2025Updated last year
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)☆12Mar 12, 2024Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- [InterSpeech'2023] "Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion"☆13Mar 14, 2024Updated 2 years ago
- This repository contains the code for the paper "Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection fr…☆12Dec 19, 2025Updated 4 months ago
- MANNER: Multi-view Attention Network for Noise ERasure (Speech enhancement in time-domain)☆65Aug 29, 2022Updated 3 years ago
- ☆10Oct 25, 2019Updated 6 years ago
- Source code for AAAI 22 paper: Hybrid Neural Networks for On-Device Directional Hearing☆19Apr 10, 2024Updated 2 years ago
- LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for edit and control the target language's phoneme…☆24Aug 14, 2025Updated 8 months ago
- This in an implementation of NSNet in PyTorch and PyTorch Lightning. NSNet is a recurrent neural network for single channel speech enhanc…☆40Aug 20, 2020Updated 5 years ago