jymh / SAP2-ASR
☆26Jan 23, 2026Updated 3 weeks ago
Alternatives and similar repositories for SAP2-ASR
Users that are interested in SAP2-ASR are comparing it to the libraries listed below
- ☆30Jan 22, 2026Updated 3 weeks ago
- Digital Audio Effects in Python (material for MUSI6202@Georgiatech)☆15Nov 30, 2014Updated 11 years ago
- ☆52Jul 16, 2025Updated 6 months ago
- ☆53Oct 17, 2023Updated 2 years ago
- FLM-Audio is an audio-language subversion of RoboEgo/FLM-Ego -- an omnimodal model with native full duplexity.☆61Dec 9, 2025Updated 2 months ago
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024☆32Mar 14, 2025Updated 10 months ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆50Sep 20, 2025Updated 4 months ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN)☆62Apr 15, 2020Updated 5 years ago
- A piano music dataset with Audio, Symbolic and Text labels☆33Mar 6, 2025Updated 11 months ago
- This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …☆57Aug 9, 2025Updated 6 months ago
- (WIP) long-form speech generations☆31Apr 2, 2025Updated 10 months ago
- Pumilio: A Web-Based Management System for Ecological Recordings☆13Oct 29, 2018Updated 7 years ago
- Code, source data, examples, and audio excerpts for Flow: Expressive Rhythm in the Rapping Voice☆10Feb 13, 2020Updated 6 years ago
- Tensorflow with KenLM integrated for beam search scoring☆34Jul 28, 2017Updated 8 years ago
- Music2Emo: Towards Unified Music Emotion Recognition across Dimensional and Categorical Models☆43Aug 24, 2025Updated 5 months ago
- [Interspeech 2025] Official implementation of "Training-Free Voice Conversion with Factorized Optimal Transport"☆43Sep 24, 2025Updated 4 months ago
- We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction☆179Feb 3, 2026Updated last week
- A python algorithm to change the pitch of the voice in real time☆13Dec 13, 2020Updated 5 years ago
- TASU: A New Style of Alignment of Speech LLM with only Text Training Data, zero-shot on ASR and Other SU tasks☆21Jan 19, 2026Updated 3 weeks ago
- arxiv daily for speech translation, legal. Ref: Vincentqyw/cv-arxiv-daily☆14Jan 6, 2025Updated last year
- ☆37Jun 28, 2021Updated 4 years ago
- Try to replicate the architecture of MiniMaxTTS mentioned in its technical report☆49Sep 2, 2025Updated 5 months ago
- Research_speech_speaker_verification_nist_sre2010☆12Mar 1, 2016Updated 9 years ago
- PyGun: Procedural Generation of Anechoic Gunshot Sounds☆13Oct 8, 2016Updated 9 years ago
- ATC-Anno is an annotation tool for Air Traffic Control data that offers automatic semantic and concept annotation.☆12Nov 17, 2023Updated 2 years ago
- ☆10Jul 24, 2019Updated 6 years ago
- ☆12Feb 5, 2026Updated last week
- Tool for Evaluating Multilingual WS-353 and SimLex-999☆10Dec 15, 2016Updated 9 years ago
- A simple python script to follow stock market papers in your portfolio☆12Jun 29, 2020Updated 5 years ago
- Resources for "Simple Speech Representation Learning from Perceptual Data".☆11Sep 18, 2023Updated 2 years ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"☆19Nov 3, 2025Updated 3 months ago
- A Tree-LSTM-based dependency tree sentiment labeler☆15May 9, 2019Updated 6 years ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- Copilot with deepseek and more...☆12Mar 7, 2025Updated 11 months ago
- A custom toolkit to implement partial least squares regression (PLSR) and discriminant analysis (PLSDA) in MATLAB.☆12Jun 5, 2025Updated 8 months ago
- 1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context☆16Dec 8, 2022Updated 3 years ago
- Listen to the weather using Sonic Pi and data from Mathematica☆11Dec 6, 2018Updated 7 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR☆11Oct 21, 2022Updated 3 years ago
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming☆15Aug 20, 2024Updated last year