Detecting and correcting dysfluencies/stuttering/stammering in audio files
☆10Apr 23, 2023Updated 2 years ago
Alternatives and similar repositories for Dysfluency-detection-and-correction
Users that are interested in Dysfluency-detection-and-correction are comparing it to the libraries listed below
- StutterFormer is an AI model that aims to be able to receive a speech sample with stuttering disfluencies, and return it with the disflue…☆19Feb 10, 2023Updated 3 years ago
- SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge☆12Jun 11, 2024Updated last year
- Final semester project on Stuttered Speech recognition☆17Sep 29, 2017Updated 8 years ago
- Non-parallel voice conversion called ICRCycleGAN-VC based on CycleGAN and Inception-resNet module by Afiuny☆15Oct 30, 2025Updated 4 months ago
- This is a simple implementation of Saavedra-Barrera's paper SAAVEDRA-BARRERA R H. CPU Performance Evaluation and Execution Time Predictio…☆10Nov 23, 2021Updated 4 years ago
- The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…☆50May 14, 2025Updated 9 months ago
- This is the code for the paper "Generative Adversarial Network Based Abnormal Behavior Detection in Massive Crowd Videos: A Hajj Case Study"☆11Jun 8, 2021Updated 4 years ago
- ☆10Oct 20, 2022Updated 3 years ago
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Feb 23, 2024Updated 2 years ago
- Open-source Mandarin biased word dataset☆14Sep 21, 2023Updated 2 years ago
- Getting confidences from any end-to-end systems☆11May 24, 2023Updated 2 years ago
- Uyghur text resource crawled from website☆12Dec 25, 2015Updated 10 years ago
- ☆19Jul 22, 2025Updated 7 months ago
- Target speaker automatic speech recognition (TS-ASR)☆12Oct 14, 2023Updated 2 years ago
- Code for the paper "FastAdaSP: An Efficient Multitask Inference Framework for Large Speech Language Models". @ EMNLP'24(Oral)☆13Nov 14, 2024Updated last year
- kaldi cnn-tdnnf baseline☆13Aug 31, 2021Updated 4 years ago
- ☆11Feb 14, 2025Updated last year
- ☆10Jun 8, 2022Updated 3 years ago
- ☆108Feb 7, 2024Updated 2 years ago
- Official repository for U-SAM (Interspeech 2025)☆25Jun 3, 2025Updated 9 months ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆13Mar 30, 2025Updated 11 months ago
- Simple Delayed Auditory Feedback (DAF) generator. An anti-stuttering tool☆13May 10, 2020Updated 5 years ago
- zero shot side scan sonar image classification☆15Mar 29, 2021Updated 4 years ago
- Fluent is an AI Augmented Writing Tool that assists People who Stutter write scripts which they can speak fluently☆18Aug 26, 2022Updated 3 years ago
- ☆17Jun 26, 2025Updated 8 months ago
- A recipe for disfluency detection on the LibriStutter dataset using SpeechBrain☆11Mar 13, 2021Updated 4 years ago
- ☆10Apr 4, 2023Updated 2 years ago
- SLT 2024 Challenge: Post-ASR-Speaker-Tagging☆16Jun 16, 2024Updated last year
- Audio Keyword Search☆12May 5, 2019Updated 6 years ago
- NeMo: a toolkit for conversational AI☆13May 4, 2024Updated last year
- [EMNLP 2025 Findings] A complete cross-modal RAG system for end-to-end speech-to-speech large models, including ASR-based Retrieval and E…☆27Jul 11, 2025Updated 7 months ago
- Official GitHub repository for paper "SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Informa…☆22Aug 14, 2025Updated 6 months ago
- ☆13Mar 30, 2023Updated 2 years ago
- ☆15Mar 25, 2024Updated last year
- ☆17May 5, 2024Updated last year
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆18Jul 16, 2024Updated last year
- Automatic Speech Recognition (ASR) system for the Samrómur speech corpus using Kaldi☆12Sep 30, 2022Updated 3 years ago
- A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)☆38Updated this week
- This is a public repository for RATS Channel-A Speech Data, which is a chargeable noisy speech dataset under LDC. Here we release its Log…☆16Oct 22, 2022Updated 3 years ago