netease-youdao / EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
☆7,951 · Updated 8 months ago
Alternatives and similar repositories for EmotiVoice:
Users who are interested in EmotiVoice are comparing it to the libraries listed below.
- High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean. ☆6,012 · Updated 4 months ago
- Inference and training library for high-quality TTS models. ☆5,229 · Updated 4 months ago
- [SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild ☆7,015 · Updated 9 months ago
- Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime without Internet c… ☆5,840 · Updated this week
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆9,017 · Updated 3 weeks ago
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆13,515 · Updated last week
- An Open Source text-to-speech system built by inverting Whisper. ☆4,234 · Updated 3 weeks ago
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity… ☆10,186 · Updated 2 weeks ago
- AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation ☆4,936 · Updated 10 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/ ☆7,856 · Updated last year
- Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated. ☆4,494 · Updated last month
- Foundational model for human-like, expressive TTS ☆4,104 · Updated 9 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆5,699 · Updated 8 months ago
- Multilingual Voice Understanding Model ☆5,511 · Updated last month
- SOTA Open Source TTS ☆20,964 · Updated 3 weeks ago
- ☆1,290 · Updated 10 months ago
- MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting ☆4,059 · Updated 2 weeks ago
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆32,089 · Updated 2 weeks ago
- Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing> ☆4,636 · Updated 2 months ago
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆39,745 · Updated 8 months ago
- Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key ☆8,120 · Updated this week
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,253 · Updated last month
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆11,659 · Updated this week
- MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising ☆2,690 · Updated 10 months ago
- PhotoMaker [CVPR 2024] ☆9,911 · Updated 6 months ago
- A generative speech model for daily dialogue. ☆36,024 · Updated last month
- FaceChain is a deep-learning toolchain for generating your Digital-Twin. ☆9,388 · Updated 3 weeks ago
- Faster Whisper transcription with CTranslate2 ☆15,776 · Updated last week
- Converts text to speech in realtime ☆2,942 · Updated 2 weeks ago
- ☆8,341 · Updated 10 months ago