☆43 · Feb 8, 2025 · Updated last year
Alternatives and similar repositories for LLaSA_inference
Users interested in LLaSA_inference are comparing it to the libraries listed below.
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆657 · Jan 21, 2026 · Updated last month
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆348 · Jul 21, 2025 · Updated 7 months ago
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (such as fish/cosy/chattts). ☆94 · Oct 8, 2025 · Updated 4 months ago
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs ☆57 · Nov 19, 2025 · Updated 3 months ago
- Unofficial pytorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC… ☆19 · May 12, 2023 · Updated 2 years ago
- ☆99 · Jan 19, 2026 · Updated last month
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows ☆124 · Sep 2, 2025 · Updated 6 months ago
- Code for the paper "JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis" ☆14 · Nov 5, 2024 · Updated last year
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions. ☆183 · Updated this week
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching ☆45 · Feb 9, 2025 · Updated last year
- FlowMirror-HydraVox: A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens… ☆38 · Feb 17, 2026 · Updated 2 weeks ago
- ☆40 · Jul 15, 2025 · Updated 7 months ago
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi… ☆63 · Dec 26, 2025 · Updated 2 months ago
- Compute WER and SER for speech recognition evaluation ☆26 · Dec 15, 2025 · Updated 2 months ago
- [NeurIPS' 25] Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges. ☆189 · Dec 9, 2025 · Updated 2 months ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆213 · Sep 19, 2024 · Updated last year
- The baselines of ARC-Challenge-Interspeech2026 ☆56 · Dec 1, 2025 · Updated 3 months ago
- poorman's ar-dit tts ☆45 · Dec 31, 2025 · Updated 2 months ago
- Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction ☆218 · Feb 28, 2025 · Updated last year
- This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs ☆90 · Sep 19, 2025 · Updated 5 months ago
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 · Jan 12, 2025 · Updated last year
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization ☆110 · May 20, 2025 · Updated 9 months ago
- speaker-disentangled speech linguistic content quantizer ☆24 · Mar 19, 2025 · Updated 11 months ago
- FreeCodec: A Disentangled Neural Speech Codec with Fewer Tokens ☆24 · Sep 9, 2024 · Updated last year
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization ☆103 · Feb 5, 2024 · Updated 2 years ago
- ☆59 · Oct 22, 2025 · Updated 4 months ago
- AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model ☆293 · Oct 12, 2025 · Updated 4 months ago
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words ☆56 · Jun 25, 2024 · Updated last year
- Easy-to-Use Speech MOS predictors ☆346 · Oct 24, 2023 · Updated 2 years ago
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR. ☆26 · Jun 1, 2023 · Updated 2 years ago
- LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement ☆98 · Apr 1, 2025 · Updated 11 months ago
- Text-To-Speech for NotebookLM ☆39 · Jul 20, 2025 · Updated 7 months ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆96 · Nov 9, 2024 · Updated last year
- [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis ☆36 · Dec 24, 2025 · Updated 2 months ago
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis ☆44 · Oct 28, 2024 · Updated last year
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators ☆54 · May 15, 2025 · Updated 9 months ago
- g2p for english tts ☆19 · Nov 10, 2022 · Updated 3 years ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆19 · Apr 10, 2025 · Updated 10 months ago
- ☆13 · Oct 9, 2025 · Updated 4 months ago