kaistmm / VoiceDiT
[ICASSP2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis
☆52 · Apr 9, 2025 · Updated 10 months ago
Alternatives and similar repositories for VoiceDiT
Users who are interested in VoiceDiT are comparing it to the repositories listed below.
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis ☆27 · Mar 21, 2025 · Updated 10 months ago
- ☆19 · Apr 18, 2024 · Updated last year
- This is the official repository of "Scalable Neural Vocoder from Range-Null Space Decomposition", which is submitted to TPAMI. ☆34 · Oct 11, 2025 · Updated 4 months ago
- Unofficial implementation of ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech ☆19 · Feb 9, 2025 · Updated last year
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching ☆45 · Feb 9, 2025 · Updated last year
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset ☆12 · Sep 29, 2025 · Updated 4 months ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024) ☆43 · Jun 13, 2024 · Updated last year
- ☆15 · Nov 11, 2024 · Updated last year
- Feed-forward compressor experiments source code for "Differentiable All-pole Filters for Time-varying Audio Systems". ☆22 · Jun 10, 2024 · Updated last year
- Code for the paper "JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis" ☆14 · Nov 5, 2024 · Updated last year
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis ☆50 · Sep 20, 2025 · Updated 4 months ago
- PyTorch implementation of SoundCTM ☆100 · Mar 31, 2025 · Updated 10 months ago
- Self-supervised Generative LM-based Voice Conversion ☆54 · Apr 24, 2025 · Updated 9 months ago
- VoiceLDM: Text-to-Speech with Environmental Context ☆191 · Aug 9, 2024 · Updated last year
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ☆36 · Jun 6, 2024 · Updated last year
- UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts ☆40 · Jun 12, 2025 · Updated 8 months ago
- [ICASSP 2024] Official code for FreGrad