freds0/data_augmentation_for_asr

A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.

☆49

Alternatives and similar repositories for data_augmentation_for_asr

Users who are interested in data_augmentation_for_asr are comparing it to the libraries listed below.

theMoro / DIRAugmentation
Improving Recording Device Generalization using Impulse Response Augmentation
☆21 · Apr 24, 2025 · Updated last year
Anwarvic / VAD_Benchmark
Benchmarking different VAD models on the AVA-Speech dataset
☆19 · May 21, 2023 · Updated 3 years ago
freds0 / kabooks
KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…
☆13 · Mar 24, 2023 · Updated 3 years ago
henryleu / go-vad
Go VAD (voice activity detection) library based on WebRTC
☆12 · Dec 13, 2021 · Updated 4 years ago
primepake / F5-TTS-meanflow-multilingual
Meanflow and multilingual support for the F5-TTS model
☆16 · Aug 23, 2025 · Updated 11 months ago
Audio-AGI / dcase2024_task9_baseline
Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"
☆26 · Mar 27, 2024 · Updated 2 years ago
TuZehai / Sheffield_Clarity_CEC1_Entry
Implementation of the Sheffield entry for the Clarity enhancement challenge.
☆18 · Apr 19, 2022 · Updated 4 years ago
iot-salzburg / nearest-advocate
A time delay estimation method for event-based time-series data. Time delay estimation is also known as the correction of time offsets an…
☆16 · Dec 3, 2025 · Updated 7 months ago
nikolakopoulos / Personalized-Diffusions
Personalized Item Exploration Processes for Recommendation
☆15 · Sep 19, 2019 · Updated 6 years ago
Le-Xiaohuai-speech / GMM_VAD
☆17 · Apr 3, 2022 · Updated 4 years ago
facebookresearch / WavAugment
A library for speech data augmentation in the time domain
☆689 · Aug 30, 2021 · Updated 4 years ago
RookieJunChen / Inter-SubNet
The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023.
☆102 · May 24, 2023 · Updated 3 years ago
rossellhayes / ipa
🗣️ Convert between phonetic alphabets
☆11 · Feb 7, 2022 · Updated 4 years ago
mmemim / SynthV-FrenchDictionary
English to French and Chinese to French .json dictionaries for Synthesizer V
☆15 · Feb 1, 2023 · Updated 3 years ago
alumae / voxlingua107_sb
VoxLingua107 recipe for SpeechBrain
☆13 · Jul 3, 2021 · Updated 5 years ago
EMRAI / emrai-synthetic-diarization-corpus
☆22 · Sep 24, 2018 · Updated 7 years ago
kinggongzilla / DCASE2023_Task2
☆23 · May 15, 2023 · Updated 3 years ago
CoEDL / kaldi_helpers
A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
☆15 · May 19, 2020 · Updated 6 years ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
Xianchao-Wu / wenet-deep-sparse-conformer
☆15 · Aug 25, 2022 · Updated 3 years ago
Okrio / tinyrecurrentunet
Real-Time De-noising and De-reverbing with Tiny Recurrent UNet
☆56 · Jun 7, 2023 · Updated 3 years ago
echocatzh / GTCNN
Personalized AEC
☆19 · Nov 3, 2022 · Updated 3 years ago
yingtaoHuo / wakeUp
Reproduction of the paper "Small-footprint keyword spotting using deep neural networks"
☆12 · Mar 11, 2019 · Updated 7 years ago
isHuangZiling / D-LGTSE
☆24 · Jul 19, 2026 · Updated last week
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54 · May 1, 2025 · Updated last year
01-vyom / End_2_End_Automatic_Speech_Recognition_For_Gujarati
[ICON 2020] TensorFlow Code for "End-to-End Automatic Speech Recognition System for Gujarati"
☆13 · Jul 26, 2021 · Updated 5 years ago
Mddct / usm-tokenizer
Semantic tokenizer for speech and music
☆20 · Jul 6, 2025 · Updated last year
odunola499 / f5-lora
☆19 · Nov 18, 2025 · Updated 8 months ago
msalhab96 / Listen-Attend-and-Spell
PyTorch implementation of the Listen, Attend and Spell (LAS) speech recognition paper
☆12 · Mar 4, 2022 · Updated 4 years ago
lixilinx / IVA4Cocktail
Neural network density models for speech separation.
☆20 · Nov 26, 2020 · Updated 5 years ago
angsaikia / voice-filter
Unofficial TensorFlow/Keras implementation of Google AI VoiceFilter
☆16 · Mar 25, 2023 · Updated 3 years ago
jjunak-yun / FLowHigh_code
[ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"
☆118 · Jan 17, 2025 · Updated last year
WangHelin1997 / SpecAugment-plus
A PyTorch implementation of the paper: SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification
☆34 · Jun 25, 2021 · Updated 5 years ago
Jackson-Kang / VQVC-Pytorch
An unofficial implementation of Vector Quantization Voice Conversion (VQVC).
☆29 · Apr 12, 2021 · Updated 5 years ago
juice500ml / xlm_to_xlsr
Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)
☆12 · Mar 12, 2024 · Updated 2 years ago
FabioSmuu / Base-DiscordBot
This is my handler for creating Discord bots.
☆11 · Sep 5, 2024 · Updated last year
wanganran / HybridBeam
Source code for AAAI 22 paper: Hybrid Neural Networks for On-Device Directional Hearing
☆19 · Apr 10, 2024 · Updated 2 years ago
pengzhendong / welm
One command to build TLG.fst for WeNet.
☆30 · Oct 11, 2022 · Updated 3 years ago
winddori2002 / MANNER
MANNER: Multi-view Attention Network for Noise ERasure (Speech enhancement in the time domain)
☆65 · Aug 29, 2022 · Updated 3 years ago