☆26Jan 23, 2026Updated 2 months ago
Alternatives and similar repositories for SAP2-ASR
Users that are interested in SAP2-ASR are comparing it to the libraries listed below.
- ☆54Jul 16, 2025Updated 9 months ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆52Sep 20, 2025Updated 6 months ago
- ☆30Jan 22, 2026Updated 2 months ago
- ☆17Feb 14, 2026Updated 2 months ago
- (WIP) long-form speech generation☆31Apr 2, 2025Updated last year
- We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction☆206Apr 7, 2026Updated last week
- This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …☆57Aug 9, 2025Updated 8 months ago
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024☆34Mar 14, 2025Updated last year
- ☆30Updated this week
- Digital Speech Processing in PyTorch.☆15Aug 12, 2022Updated 3 years ago
- A piano music dataset with Audio, Symbolic and Text labels☆34Mar 6, 2025Updated last year
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN)☆62Apr 15, 2020Updated 6 years ago
- [AAAI'25 Oral] "RFL: Simplifying Chemical Structure Recognition with Ring-Free Language".☆20Jun 14, 2025Updated 10 months ago
- ☆36Jan 9, 2026Updated 3 months ago
- Performs inverse normalization on normalized Chinese text; that is, it implements the reverse of chinese_text_normalization.☆13Apr 7, 2021Updated 5 years ago
- ☆54Oct 17, 2023Updated 2 years ago
- An attempt to replicate the architecture of MiniMaxTTS described in its technical report☆47Sep 2, 2025Updated 7 months ago
- ☆13Sep 25, 2024Updated last year
- Code for the paper "Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription"☆40May 5, 2024Updated last year
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- ☆18Sep 22, 2025Updated 6 months ago
- Word Sense Disambiguation system developed on the DutchSemCor project using Support Vector Machines. The input is plain text, and the out…☆12Feb 5, 2019Updated 7 years ago
- [Interspeech 2025] Official implementation of "Training-Free Voice Conversion with Factorized Optimal Transport"☆43Sep 24, 2025Updated 6 months ago
- List of 4000 Chinese characters sorted by historical usage frequency, with Cantonese Yale romanization and definitions☆14Dec 18, 2022Updated 3 years ago
- ☆27Jul 6, 2024Updated last year
- A lightweight audio codec based on a single quantizer☆34Sep 4, 2025Updated 7 months ago
- Neural network sequence labeling model - some sloppy modifications to the original toolkit to enable punctuation restoration in unsegment…☆10Jan 8, 2017Updated 9 years ago
- RNN model to punctuate degraded text with no punctuation, and an application that combines it with Watson TTS for automated transcription…☆10Apr 9, 2017Updated 9 years ago
- A repository for Chinese text normalization.☆20May 2, 2021Updated 4 years ago
- Variable Bitrate Residual Vector Quantization for Audio Coding☆50May 1, 2025Updated 11 months ago
- A library for adding punctuation into a text from ASR.☆19May 8, 2023Updated 2 years ago
- [NeurIPS'23] ODE-based Recurrent Model-free Reinforcement Learning for POMDPs☆18May 3, 2025Updated 11 months ago
- ☆12Nov 23, 2020Updated 5 years ago
- Digital Audio Effects in Python (material for MUSI6202@Georgiatech)☆15Nov 30, 2014Updated 11 years ago
- ☆24Jun 13, 2022Updated 3 years ago
- Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.☆185Mar 20, 2026Updated 3 weeks ago
- Poor man's AR-DiT TTS☆45Dec 31, 2025Updated 3 months ago
- Official repository of Spiking-FullSubNet, the Intel N-DNS Challenge Algorithmic Track Winner.☆134Jan 28, 2026Updated 2 months ago
- Music2Emo: Towards Unified Music Emotion Recognition across Dimensional and Categorical Models☆46Aug 24, 2025Updated 7 months ago