ZZDoog / Speaker2Dubber
[ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"
☆33 · May 7, 2025 · Updated 9 months ago
Alternatives and similar repositories for Speaker2Dubber
Users who are interested in Speaker2Dubber compare it to the repositories listed below.
- [CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie… ☆23 · Jun 6, 2025 · Updated 8 months ago
- [CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models. ☆111 · Jun 21, 2024 · Updated last year
- Pytorch implementation for “V2C: Visual Voice Cloning” ☆33 · Jan 28, 2023 · Updated 3 years ago
- [ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆98 · Nov 14, 2024 · Updated last year
- Avatar: An easy-to-use digital portrait PPT presentation video generation system based on Gradio ☆20 · Nov 7, 2023 · Updated 2 years ago
- ☆13 · Oct 9, 2025 · Updated 4 months ago
- A toolkit dedicated to speech evaluation. ☆24 · Sep 26, 2024 · Updated last year
- [CVPR 2024] Official code for paper: Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection. ☆26 · Aug 19, 2024 · Updated last year
- ☆13 · Jan 2, 2025 · Updated last year
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024) ☆12 · Sep 2, 2024 · Updated last year
- Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing. ☆37 · Jun 3, 2025 · Updated 8 months ago
- UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts ☆40 · Jun 12, 2025 · Updated 8 months ago
- ☆16 · Mar 25, 2025 · Updated 10 months ago
- ☆16 · Jun 22, 2025 · Updated 7 months ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 · May 13, 2024 · Updated last year
- 16k Hz Vocoder (HiFiGAN Codes and Pretrained Models) ☆18 · Apr 3, 2023 · Updated 2 years ago
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆14 · Jun 27, 2023 · Updated 2 years ago
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody" ☆34 · Nov 23, 2023 · Updated 2 years ago
- [ICLR 2025] This repo is the official implementation of our paper "Learning Fine-Grained Representations through Textual Token Disentangl… ☆22 · Jul 28, 2025 · Updated 6 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision ☆19 · Dec 1, 2022 · Updated 3 years ago
- Received by ICLR2025 ☆49 · Nov 19, 2025 · Updated 2 months ago
- This is a public repository for RATS Channel-A Speech Data, which is a chargeable noisy speech dataset under LDC. Here we release its Log… ☆16 · Oct 22, 2022 · Updated 3 years ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆23 · Oct 30, 2024 · Updated last year
- Continual Learning Benchmark for Spoken Keyword Spotting ☆17 · Jun 7, 2022 · Updated 3 years ago
- PyTorch Implementation of Stepwise Monotonic Multihead Attention similar to Enhancing Monotonicity for Robust Autoregressive Transformer … ☆39 · May 16, 2021 · Updated 4 years ago
- DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning… ☆28 · Sep 7, 2025 · Updated 5 months ago
- ☆43 · Feb 8, 2025 · Updated last year
- ☆32 · Dec 24, 2025 · Updated last month
- [Not Official] Implementation of TC-Resnet, INTERSPEECH 2019 ☆22 · Jan 24, 2024 · Updated 2 years ago
- Survey on speech generation work. ☆21 · Nov 26, 2023 · Updated 2 years ago
- Unofficial pytorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC… ☆19 · May 12, 2023 · Updated 2 years ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆96 · Nov 9, 2024 · Updated last year
- CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding ☆22 · Dec 17, 2025 · Updated 2 months ago
- ☆23 · Oct 17, 2024 · Updated last year
- FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation ☆28 · Dec 19, 2024 · Updated last year
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024) ☆26 · Feb 22, 2024 · Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆29 · Oct 15, 2024 · Updated last year
- This is a winter of code project aimed at speech enhancement of text to speech models. ☆24 · Feb 6, 2022 · Updated 4 years ago