☆26Jan 23, 2026Updated 2 months ago
Alternatives and similar repositories for SAP2-ASR
Users that are interested in SAP2-ASR are comparing it to the libraries listed below.
- ☆55Jul 16, 2025Updated 8 months ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆51Sep 20, 2025Updated 6 months ago
- ☆30Jan 22, 2026Updated 2 months ago
- (WIP) Long-form speech generation☆31Apr 2, 2025Updated 11 months ago
- We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction☆196Mar 19, 2026Updated last week
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024☆32Mar 14, 2025Updated last year
- This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …☆57Aug 9, 2025Updated 7 months ago
- Digital Speech Processing in PyTorch.☆15Aug 12, 2022Updated 3 years ago
- A piano music dataset with Audio, Symbolic and Text labels☆34Mar 6, 2025Updated last year
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN)☆62Apr 15, 2020Updated 5 years ago
- ☆35Jan 9, 2026Updated 2 months ago
- ☆52Oct 17, 2023Updated 2 years ago
- Performs inverse normalization on normalized Chinese text; specifically, it implements the inverse of chinese_text_normalization.☆13Apr 7, 2021Updated 4 years ago
- An attempt to replicate the architecture of MiniMaxTTS described in its technical report☆48Sep 2, 2025Updated 6 months ago
- ☆13Sep 25, 2024Updated last year
- Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.☆155Updated this week
- Code for the paper "Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription"☆40May 5, 2024Updated last year
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- ☆18Sep 22, 2025Updated 6 months ago
- Word Sense Disambiguation system developed on the DutchSemCor project using Support Vector Machines. The input is plain text, and the out…☆12Feb 5, 2019Updated 7 years ago
- A custom toolkit to implement partial least squares regression (PLSR) and discriminant analysis (PLSDA) in MATLAB.☆12Jun 5, 2025Updated 9 months ago
- Java API for android acoustic echo cancellation.☆15Jul 2, 2016Updated 9 years ago
- [Interspeech 2025] Official implementation of "Training-Free Voice Conversion with Factorized Optimal Transport"☆43Sep 24, 2025Updated 6 months ago
- A lightweight audio codec based on a single quantizer☆33Sep 4, 2025Updated 6 months ago
- RNN model to punctuate degraded text with no punctuation, and an application that combines it with Watson TTS for automated transcription…☆10Apr 9, 2017Updated 8 years ago
- A repository for Chinese text normalization.☆20May 2, 2021Updated 4 years ago
- Variable Bitrate Residual Vector Quantization for Audio Coding☆50May 1, 2025Updated 10 months ago
- Spatial active noise control based on kernel interpolation of sound field☆14Mar 30, 2023Updated 2 years ago
- [NeurIPS'23] ODE-based Recurrent Model-free Reinforcement Learning for POMDPs☆18May 3, 2025Updated 10 months ago
- Digital Audio Effects in Python (material for MUSI6202@Georgiatech)☆15Nov 30, 2014Updated 11 years ago
- ☆24Jun 13, 2022Updated 3 years ago
- poorman's ar-dit tts☆45Dec 31, 2025Updated 2 months ago
- Official repository of Spiking-FullSubNet, the Intel N-DNS Challenge Algorithmic Track Winner.☆129Jan 28, 2026Updated last month
- Music2Emo: Towards Unified Music Emotion Recognition across Dimensional and Categorical Models☆47Aug 24, 2025Updated 7 months ago
- This collection of utilities that complements LIBSVM provides tools to convert ARFF file to SVM file format, shuffling, remapping, and di…☆13Aug 20, 2011Updated 14 years ago
- [Official Implementation] Acoustic Autoregressive Modeling 🔥☆75Aug 24, 2024Updated last year
- Self-supervised Generative LM-based Voice Conversion☆55Apr 24, 2025Updated 11 months ago
- This setup allows training end-to-end neural models for spoken language understanding (SLU).☆11Jun 12, 2023Updated 2 years ago
- AudioLDM training, finetuning, evaluation and inference.☆14Mar 27, 2024Updated 2 years ago