Bartelds / asr-augmentation
Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation
☆17Updated 2 years ago
Alternatives and similar repositories for asr-augmentation
Users who are interested in asr-augmentation are comparing it to the repositories listed below.
- A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features☆10Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆88Updated last year
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"☆43Updated 5 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆53Updated 3 months ago
- ☆25Updated last year
- Analysis of XLS-R for Speech Quality Assessment☆14Updated 7 months ago
- The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix (work accepted at I…☆17Updated 2 years ago
- Wav2vec 2.0 Self-Supervised Pretraining☆50Updated 7 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…☆94Updated 8 months ago
- ☆43Updated 11 months ago
- ☆24Updated 4 months ago
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi…☆38Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech☆10Updated 4 months ago
- ☆15Updated last year
- NOTSOFAR-1 Challenge: Distant Diarization and ASR☆57Updated 7 months ago
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!☆42Updated 2 years ago
- ☆16Updated last year
- ☆19Updated last year
- Mispronunciation Detection using a pretrained and finetuned wav2vec2 model for phoneme recognition and diagnosis and feedback using large…☆29Updated last year
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆53Updated 2 years ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors☆59Updated last year
- [ICASSP 2025] Official Pytorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners".☆32Updated 2 months ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆148Updated last year
- ☆68Updated last year
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial…☆40Updated 7 months ago
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…☆78Updated last year
- Repository for "LLM-based speaker diarization correction: A generalizable approach" paper☆16Updated last year
- ☆85Updated last year
- ConMamba for Automatic Speech Recognition☆86Updated last year
- A TTS model that makes a speaker speak new languages☆76Updated last year