thu-ml / Bridge-TTS
Official codebase for "Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis" (https://arxiv.org/abs/2312.03491).
☆119 · Updated 2 months ago
Related projects:
- CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model ☆177 · Updated 4 months ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers ☆68 · Updated 2 months ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆127 · Updated last year
- Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati… ☆141 · Updated 5 months ago
- Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS ☆159 · Updated 5 months ago
- [IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer ☆99 · Updated 5 months ago
- ☆27 · Updated 9 months ago
- A text-conditional diffusion probabilistic model capable of generating high-fidelity audio. ☆118 · Updated 3 months ago
- The open source code for LLM-Codec ☆106 · Updated last month
- Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation". ☆55 · Updated last month
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆146 · Updated 2 months ago
- All generative models in one for a better TTS model ☆64 · Updated last week
- Source for the Interspeech 2024 paper "Scaling up masked audio encoder learning for general audio classification" ☆39 · Updated last week
- ☆97 · Updated last week
- Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models ☆148 · Updated 3 months ago
- ☆64 · Updated last year
- Official PyTorch implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for V… ☆163 · Updated last month
- [Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a… ☆60 · Updated 5 months ago
- Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model ☆75 · Updated 2 weeks ago
- The official implementation of PeriodWave and PeriodWave-Turbo ☆107 · Updated last month
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆111 · Updated 6 months ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆79 · Updated 2 months ago
- Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours) ☆211 · Updated this week
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆52 · Updated 3 weeks ago
- SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words ☆34 · Updated 2 months ago
- Unofficial implementation of High Fidelity Neural Audio Compression ☆129 · Updated last month
- Official code for the CVPR'24 paper Diff-BGM ☆38 · Updated 5 months ago
- Jointly training a speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion ☆132 · Updated 11 months ago
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching" ☆295 · Updated 2 weeks ago
- Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer ☆73 · Updated last year