ZZDoog/ProDubber

[CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing"

☆23

Alternatives and similar repositories for ProDubber

Users who are interested in ProDubber are comparing it to the repositories listed below.

ZZDoog / Speaker2Dubber
[ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"
☆34 · Jul 14, 2026 · Updated last week
HDUyiming / SOCCER
We are very happy that our work has been accepted by ACM Multimedia 2024! 🥰
☆12 · Jan 8, 2025 · Updated last year
kevendai / fandp-ijcai2025-issues
☆17 · Oct 13, 2025 · Updated 9 months ago
ZZDoog / Avatar
Avatar: An easy-to-use digital portrait PPT presentation video generation system based on Gradio
☆20 · Nov 7, 2023 · Updated 2 years ago
chenqi008 / V2C
PyTorch implementation for “V2C: Visual Voice Cloning”
☆34 · Jan 28, 2023 · Updated 3 years ago
GalaxyCong / StyleDubber
[ACL 2024] This is the PyTorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"
☆98 · Nov 14, 2024 · Updated last year
GalaxyCong / EmoDubber
[CVPR 2025] Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.
☆38 · Jun 3, 2025 · Updated last year
kaistmm / AlignDiT
[ACM MM 2025] AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
☆24 · Oct 28, 2025 · Updated 8 months ago
Junxi-Chen / PE-MIL
[CVPR 2024] Official code for paper: Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection.
☆27 · Aug 19, 2024 · Updated last year
yili-19 / SSGPA
☆17 · Jul 14, 2025 · Updated last year
GalaxyCong / HPMDubbing
[CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models.
☆111 · Jun 21, 2024 · Updated 2 years ago
tuyunbin / Review-of-Change-Captioning
This repository offers a comprehensive overview of existing datasets and methods in the field of change captioning.
☆17 · Sep 2, 2025 · Updated 10 months ago
SWivid / AUV
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28 · Oct 11, 2025 · Updated 9 months ago
bigai-nlco / UltraVoice
Official Repository of UltraVoice
☆62 · Oct 28, 2025 · Updated 8 months ago
walker-hyf / FCTalker
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)
☆26 · Feb 22, 2024 · Updated 2 years ago
walker-hyf / NCSSD
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆61 · Nov 1, 2024 · Updated last year
BiSinger-SVS / BiSinger
Bilingual Singing Voice Synthesis
☆18 · Mar 25, 2024 · Updated 2 years ago
keonlee9420 / Stepwise_Monotonic_Multihead_Attention
PyTorch Implementation of Stepwise Monotonic Multihead Attention similar to Enhancing Monotonicity for Robust Autoregressive Transformer …
☆39 · May 16, 2021 · Updated 5 years ago
kaist-ami / voicecraft-dub
[ICCV'25] Official PyTorch Implementation of "VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models"
☆17 · Dec 8, 2025 · Updated 7 months ago
walker-hyf / ECSS
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)
☆59 · Jun 20, 2024 · Updated 2 years ago
KTTRCDL / UMETTS
UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts
☆41 · Jun 12, 2025 · Updated last year
dengpeihua / GROTO
[CVPR 2025] Official implementation of paper "Multi-Granularity Class Prototype Topology Distillation for Class-Incremental Source-Free …
☆19 · Apr 16, 2026 · Updated 3 months ago
guozixunnicolas / FundamentalMusicEmbedding
☆32 · Nov 25, 2023 · Updated 2 years ago
zeyuxie29 / SemanticVocoder
☆28 · Apr 6, 2026 · Updated 3 months ago
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17 · Aug 18, 2023 · Updated 2 years ago
reppy4620 / convnext_tts
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18 · Oct 20, 2024 · Updated last year
light1726 / SpeechTripleNet
The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"
☆33 · Nov 23, 2023 · Updated 2 years ago
ajd12342 / paraspeechclap
Codebase for 'ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining'
☆23 · Jun 20, 2026 · Updated last month
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
May2333 / FDCA
[ICLR 2025] This repo is the official implementation of our paper "Learning Fine-Grained Representations through Textual Token Disentangl…
☆23 · Jul 28, 2025 · Updated 11 months ago
MTG / PodcastMix-inference
☆32 · Jan 6, 2022 · Updated 4 years ago
zengchang233 / xiaoicesing2
The source code for the paper XiaoiceSing2 (Interspeech 2023)
☆49 · Jan 15, 2024 · Updated 2 years ago
elpsykongloo / FD-SLMs
This is an evolving repo for the paper “From Turn-Taking to Synchronous Dialogue: A Survey of Full-Duplex Spoken Language Models” A compr…
☆25 · Dec 23, 2025 · Updated 6 months ago
RanaCM / DSU-AVO
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12 · May 13, 2024 · Updated 2 years ago
huutuongtu / Lightvoc
LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM
☆18 · May 17, 2024 · Updated 2 years ago
KevinMIN95 / StyleSpeech
Official implementation of Meta-StyleSpeech and StyleSpeech
☆253 · Feb 9, 2022 · Updated 4 years ago
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆60 · Apr 4, 2024 · Updated 2 years ago
zjzser / WMCodec
PyTorch Implementation of [WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification](https://arxiv.or…
☆18 · Jul 31, 2025 · Updated 11 months ago
kaistmm / V2SFlow
[ICASSP 2025] V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow
☆21 · Jun 3, 2025 · Updated last year