Fast audio super-resolution from 16 kHz to 48 kHz.
☆198Jan 3, 2026Updated 2 months ago
Alternatives and similar repositories for FlashSR
Users who are interested in FlashSR are comparing it to the libraries listed below.
- LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for editing and controlling the target language's phoneme…☆23Aug 14, 2025Updated 6 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15May 16, 2025Updated 9 months ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- ☆70Jan 25, 2025Updated last year
- ☆21Jul 15, 2024Updated last year
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion☆35Sep 9, 2025Updated 5 months ago
- [ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion☆91Jul 23, 2025Updated 7 months ago
- ☆11Nov 7, 2024Updated last year
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 4 months ago
- Soprano-Factory: Train your own 2000x realtime text-to-speech model☆211Jan 13, 2026Updated last month
- [ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"☆108Jan 17, 2025Updated last year
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec"☆35Oct 23, 2025Updated 4 months ago
- A lightning fast audio upsampler.☆737Feb 26, 2026Updated last week
- A highly compressive and high-quality neural audio codec for speech models.☆257Jan 23, 2026Updated last month
- ☆14Jun 16, 2023Updated 2 years ago
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…☆28Sep 20, 2025Updated 5 months ago
- ☆22Jul 30, 2025Updated 7 months ago
- The open source code for SimpleSpeech series☆145Oct 8, 2024Updated last year
- Official PyTorch implementation of "EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders"☆14Sep 20, 2024Updated last year
- Extract phoneme-level timestamps from speech audio.☆119Feb 28, 2026Updated last week
- Simple tool for speech dataset augmentation for modeling various prosodies.☆14Jan 14, 2021Updated 5 years ago
- High-performance, semantic turn detection for conversational AI☆35Oct 1, 2025Updated 5 months ago
- Whisper Speech Quality Assessment (WhiSQA)☆16Oct 14, 2025Updated 4 months ago
- ☆54Jul 16, 2025Updated 7 months ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…☆38Jan 6, 2024Updated 2 years ago
- SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge☆12Jun 11, 2024Updated last year
- Echo-TTS inference codebase☆135Dec 5, 2025Updated 3 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.☆13Jun 2, 2023Updated 2 years ago
- Evaluation tool used in the BigVSAN paper☆14Mar 22, 2024Updated last year
- [InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆59Oct 23, 2024Updated last year
- ☆33Dec 23, 2025Updated 2 months ago
- Ablation study of local spectral attention (LSA) for full-band speech enhancement (SE)☆28Sep 16, 2023Updated 2 years ago
- Collection of scripts from mHuBERT-147.☆32Nov 19, 2024Updated last year
- This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…☆197Jan 25, 2026Updated last month
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature☆93Mar 12, 2025Updated 11 months ago
- A high quality and fast TTS repository☆505Dec 22, 2025Updated 2 months ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- ☆13Sep 12, 2024Updated last year
- Distillation of Self-Supervised Representation-Based Speech Quality Assessment☆44May 15, 2025Updated 9 months ago