tonychenxyz / emoknob
This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen, Run Chen, and Julia Hirschberg.
☆69 · Updated 6 months ago
Alternatives and similar repositories for emoknob:
Users who are interested in emoknob are comparing it to the repositories listed below.
- The official implementation of EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector (TAFFC 20… ☆84 · Updated this week
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆101 · Updated 3 months ago
- ☆50 · Updated 3 weeks ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆85 · Updated 4 months ago
- The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotio… ☆117 · Updated last week
- An unofficial PyTorch implementation of VALL-E ☆87 · Updated this week
- Zero-Shot Emotion Style Transfer ☆45 · Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆87 · Updated last year
- ☆68 · Updated 7 months ago
- ☆40 · Updated 2 months ago
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (such as fish/cosy/chattts). ☆68 · Updated 3 weeks ago
- All generative models in one for a better TTS model ☆66 · Updated 7 months ago
- The open source code for SimpleSpeech series ☆137 · Updated 6 months ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆152 · Updated last year
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey". ☆136 · Updated this week
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆52 · Updated 5 months ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆52 · Updated 5 months ago
- ☆109 · Updated last week
- Official release of StyleTalk dataset. ☆62 · Updated 9 months ago
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆77 · Updated last month
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations ☆140 · Updated 11 months ago
- ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis ☆135 · Updated 7 months ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆73 · Updated 5 months ago
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ☆129 · Updated 3 months ago
- ☆71 · Updated 3 months ago
- The official Implementation of PeriodWave and PeriodWave-Turbo ☆186 · Updated last week
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆253 · Updated last month
- [NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching ☆83 · Updated 3 weeks ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers ☆95 · Updated 5 months ago
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets' ☆117 · Updated 3 weeks ago