tonychenxyz / emoknob
This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen, Run Chen, and Julia Hirschberg.
☆37 Updated last month
Related projects
Alternatives and complementary repositories for emoknob
- ☆77 Updated 2 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆62 Updated 2 weeks ago
- The official Implementation of PeriodWave and PeriodWave-Turbo ☆129 Updated 2 months ago
- VALL-E 2 reproduction ☆83 Updated 3 months ago
- Zero-Shot Emotion Style Transfer ☆37 Updated 7 months ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆91 Updated last week
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆48 Updated 2 weeks ago
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ☆91 Updated last week
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆65 Updated 7 months ago
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units ☆25 Updated 2 weeks ago
- ☆28 Updated 11 months ago
- ☆61 Updated 3 months ago
- All generative model in one for better TTS model ☆66 Updated 2 months ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers ☆81 Updated 2 weeks ago
- ☆57 Updated 2 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆83 Updated last month
- VoiceBox neural network implementation ☆96 Updated 3 months ago
- ☆34 Updated 6 months ago
- ☆33 Updated last year
- [Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation ☆79 Updated last month
- Codebase and project page for EDMSound ☆29 Updated 11 months ago
- Unsupervised Rhythm Modeling for Voice Conversion ☆80 Updated last year
- [ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆47 Updated 2 weeks ago
- GPT-style network for phonemization with durations of text ☆62 Updated 7 months ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆28 Updated 3 weeks ago
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆65 Updated last month
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions ☆58 Updated last month
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio. ☆124 Updated 5 months ago
- An unofficial PyTorch implementation of VALL-E ☆75 Updated this week
- ☆66 Updated last year