GalaxyCong/StyleDubber

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/GalaxyCong/StyleDubber)

GalaxyCong / StyleDubber

[ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"

☆98

Alternatives and similar repositories for StyleDubber

Users that are interested in StyleDubber are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

GalaxyCong / HPMDubbing
View on GitHub
[CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models.
☆111Jun 21, 2024Updated 2 years ago
GalaxyCong / EmoDubber
View on GitHub
[CVPR 2025] Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.
☆38Jun 3, 2025Updated last year
walker-hyf / NCSSD
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆61Nov 1, 2024Updated last year
ZZDoog / ProDubber
View on GitHub
[CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie…
☆23Jun 6, 2025Updated last year
GalaxyCong / HPMDubbing_Vocoder
View on GitHub
16k Hz Vocoder (HiFiGAN Codes and Pretrained Models)
☆18Apr 3, 2023Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
chenqi008 / V2C
View on GitHub
Pytorch implementation for “V2C: Visual Voice Cloning”
☆34Jan 28, 2023Updated 3 years ago
AI-S2-Lab / FluentEditor
View on GitHub
[InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency
☆62Oct 23, 2024Updated last year
yangdongchao / LLM-Codec
View on GitHub
The open source code for LLM-Codec
☆147Aug 18, 2024Updated last year
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
View on GitHub
Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆60Apr 4, 2024Updated 2 years ago
ex3ndr / supervoice-hybrid
View on GitHub
My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26Aug 5, 2024Updated last year
p1an-lin-jung / wv_tts
View on GitHub
☆19Mar 22, 2024Updated 2 years ago
Choddeok / EmoSphere-TTS
View on GitHub
[INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for …
☆182Updated this week
nii-yamagishilab / ZMM-TTS
View on GitHub
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
☆183Mar 6, 2024Updated 2 years ago
p0p4k / Matcha-TTS-2
View on GitHub
E2E TTS using Conditional Flow Matching (Experimental*)
☆71Nov 10, 2023Updated 2 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
winddori2002 / DEX-TTS
View on GitHub
DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability
☆108Jan 17, 2025Updated last year
kaistmm / AlignDiT
View on GitHub
[ACM MM 2025] AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
☆24Oct 28, 2025Updated 8 months ago
XiangLi2022 / CM-TTS
View on GitHub
[Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a…
☆68Mar 31, 2024Updated 2 years ago
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
kaist-ami / voicecraft-dub
View on GitHub
[ICCV'25] Official PyTorch Implementation of "VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models"
☆17Dec 8, 2025Updated 7 months ago
walker-hyf / GPT-Talker
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆78Nov 1, 2024Updated last year
jishengpeng / TextrolSpeech
View on GitHub
[ICASSP 2024] TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
☆187Nov 22, 2024Updated last year
thuhcsi / VoxInstruct
View on GitHub
VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
☆100Nov 9, 2024Updated last year
p0p4k / pflowtts_pytorch
View on GitHub
Unofficial implementation of NVIDIA P-Flow TTS paper
☆228Dec 24, 2024Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
adelacvg / detail_tts
View on GitHub
All generative model in one for better TTS model
☆74Sep 8, 2024Updated last year
X-LANCE / StoryTTS
View on GitHub
[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations
☆141Apr 27, 2024Updated 2 years ago
cantabile-kwok / vec2wav2.0
View on GitHub
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
☆79Dec 3, 2024Updated last year
hs-oh-prml / DiffProsody
View on GitHub
☆69Jul 29, 2023Updated 2 years ago
line / LibriTTS-P
View on GitHub
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
☆161Jun 13, 2024Updated 2 years ago
ex3ndr / supervoice-enhance
View on GitHub
Supervoice diffusion enhance
☆28Jul 15, 2024Updated 2 years ago
walker-hyf / ECSS
View on GitHub
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)
☆59Jun 20, 2024Updated 2 years ago
hayeong0 / DDDM-VC
View on GitHub
Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for V…
☆244Jul 31, 2024Updated last year
lifeiteng / NotebookTTS
View on GitHub
Text-To-Speech for NotebookLM
☆39Jul 20, 2025Updated last year
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
glory20h / VoiceLDM
View on GitHub
VoiceLDM: Text-to-Speech with Environmental Context
☆194Aug 9, 2024Updated last year
yl4579 / PL-BERT
View on GitHub
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
☆269Jan 13, 2025Updated last year
ZZDoog / Speaker2Dubber
View on GitHub
[ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"
☆34Jul 14, 2026Updated last week
X-E-Speech / X-E-Speech-code
View on GitHub
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
☆112Apr 1, 2024Updated 2 years ago
line / promptttspp
View on GitHub
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions
☆86Oct 11, 2024Updated last year
mct10 / RepCodec
View on GitHub
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆195Jul 12, 2024Updated 2 years ago
flamed-tts / Flamed-TTS
View on GitHub
This repository implement a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …
☆57Aug 9, 2025Updated 11 months ago