shivammehta25 / Diff-TTSG
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis
☆39 Updated last year
Alternatives and similar repositories for Diff-TTSG:
Users who are interested in Diff-TTSG are comparing it to the repositories listed below.
- ☆37 Updated last month
- ☆11 Updated 7 months ago
- ☆22 Updated last month
- The project page repo for Neural Dubber. ☆29 Updated last year
- ☆25 Updated 7 months ago
- Implementation of Multi-Source Music Generation with Latent Diffusion. ☆22 Updated 5 months ago
- Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" for ICASSP2023 ☆85 Updated last year
- ☆35 Updated 10 months ago
- Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024) ☆48 Updated 3 weeks ago
- Facestar dataset. High quality audio-visual recordings of human conversational speech. ☆106 Updated 2 years ago
- ☆25 Updated 2 years ago
- E2E TTS using Conditional Flow Matching (Experimental*) ☆69 Updated last year
- Pushing the Limits of Zero-shot End-to-End Speech Translation ☆25 Updated 2 months ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆49 Updated 4 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆77 Updated 2 months ago
- Codebase and project page for EDMSound ☆33 Updated last year
- [AAAI 2025] VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization ☆43 Updated 2 months ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆81 Updated 11 months ago
- ☆41 Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆63 Updated 4 months ago
- Code for "SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces" ACM MM 2023 ☆30 Updated last year
- ☆51 Updated last year
- An AR+AR TTS attempt. ☆13 Updated last month
- ☆36 Updated 5 months ago
- ☆20 Updated 2 years ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆46 Updated 5 months ago
- The demo page of UniAudio ☆33 Updated last year
- Official release of StyleTalk dataset. ☆61 Updated 8 months ago
- DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code ☆10 Updated 2 years ago