Plachtaa / VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
☆7,553 · Updated 7 months ago
Related projects:
- Foundational Models for State-of-the-Art Speech and Text Translation ☆10,755 · Updated last month
- [CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation ☆11,637 · Updated 2 months ago
- Faster Whisper transcription with CTranslate2 ☆11,378 · Updated 3 weeks ago
- FaceChain is a deep-learning toolchain for generating your Digital-Twin. ☆8,881 · Updated last month
- [SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild ☆6,376 · Updated last month
- JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU. ☆4,355 · Updated 5 months ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) ☆11,412 · Updated 3 weeks ago
- 🔊 Text-Prompted Generative Audio Model ☆35,297 · Updated last month
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆4,714 · Updated last month
- EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine ☆7,201 · Updated last month
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆20,607 · Updated 2 months ago
- Official implementation of AnimateDiff. ☆10,270 · Updated last month
- Community interface for generative AI ☆8,677 · Updated 4 months ago
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆4,450 · Updated last week
- A multi-voice TTS system trained with an emphasis on quality ☆12,898 · Updated last month
- An Open Source text-to-speech system built by inverting Whisper. ☆3,772 · Updated 3 months ago
- StableLM: Stability AI Language Models ☆15,842 · Updated 5 months ago
- Next generation face swapper and enhancer ☆17,808 · Updated this week
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head ☆9,973 · Updated 2 months ago
- Brand new TTS solution ☆11,190 · Updated this week
- so-vits-svc fork with realtime support, improved interface and more features. ☆8,674 · Updated this week
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆7,459 · Updated 2 months ago
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆33,451 · Updated last month
- VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models ☆4,465 · Updated 2 months ago
- [CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model ☆10,359 · Updated 2 months ago
- 🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation w… ☆5,960 · Updated 2 months ago
- Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation ☆14,346 · Updated last month
- An unofficial PyTorch implementation of the audio LM VALL-E ☆2,931 · Updated last year
- Instant voice cloning by MIT and MyShell. ☆28,390 · Updated 3 weeks ago
- InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥 ☆10,850 · Updated 2 months ago