GiantAILab / DiaMoE-TTS
Official code for "DiaMoE-TTS: A Unified IPA-based Dialect TTS Framework with Mixture-of-Experts and Parameter-Efficient Zero-Shot Adaptation"
☆222Updated 2 months ago
Alternatives and similar repositories for DiaMoE-TTS
Users who are interested in DiaMoE-TTS are comparing it to the repositories listed below.
- FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.☆240Updated 2 months ago
- We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction☆174Updated last week
- ☆23Updated last year
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆110Updated last year
- ☆114Updated 3 months ago
- MOSS-Speech is a true speech-to-speech large language model without text guidance.☆120Updated last month
- OpenS2S: Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model☆105Updated 6 months ago
- ☆43Updated 11 months ago
- Official code for "EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting"☆107Updated 3 months ago
- ☆170Updated 5 months ago
- X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech…☆165Updated this week
- CosyVoice_DPO_NOTES: Supercharge Your Cosyvoice model with Cutting-Edge DPO Fine-Tuning!☆113Updated 5 months ago
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆110Updated 8 months ago
- An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators.☆197Updated last week
- An Open-Source Project to Unify Audio Processing and Generation☆174Updated this week
- ☆105Updated 4 months ago
- Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction☆216Updated 11 months ago
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.☆131Updated 4 months ago
- A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems.☆104Updated 8 months ago
- Curated list for papers, codes and resources related to Text-to-Audio (TTA) Generation☆69Updated last week
- Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.☆114Updated last month
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts).☆92Updated 3 months ago
- [NeurIPS' 25] Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.☆188Updated last month
- Text-audio foundation model from Boson AI☆117Updated 4 months ago
- This is a repository for fine-tuning Qwen2-Audio, currently supporting Distributed Data Parallel (DDP) and DeepSpeed.☆48Updated 6 months ago
- This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…☆72Updated 4 months ago
- Inference code for Audiodec-Valle-Wenetspeech4TTS☆50Updated last year
- ☆96Updated 3 months ago
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…☆62Updated last month
- Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems☆71Updated last week