Plachtaa / VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
☆7,751 Updated 11 months ago
Alternatives and similar repositories for VALL-E-X:
Users who are interested in VALL-E-X are comparing it to the libraries listed below:
- 🔊 Text-Prompted Generative Audio Model ☆36,678 Updated 4 months ago
- Foundational Models for State-of-the-Art Speech and Text Translation ☆11,156 Updated 2 months ago
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆36,915 Updated 5 months ago
- An unofficial PyTorch implementation of the audio LM VALL-E ☆2,983 Updated last year
- JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU. ☆4,499 Updated 9 months ago
- PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html ☆2,079 Updated last year
- Faster Whisper transcription with CTranslate2 ☆13,490 Updated 2 weeks ago
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆21,325 Updated this week
- EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine ☆7,594 Updated 5 months ago
- 🔊 Text-prompted Generative Audio Model - With the ability to clone voices ☆3,213 Updated 7 months ago
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head ☆10,071 Updated 6 months ago
- [CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation ☆12,216 Updated 6 months ago
- Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key ☆6,814 Updated 3 weeks ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆5,265 Updated 5 months ago
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) ☆13,382 Updated this week
- A multi-voice TTS system trained with an emphasis on quality ☆13,510 Updated last month
- so-vits-svc fork with realtime support, improved interface and more features. ☆8,848 Updated this week
- High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean. ☆5,400 Updated 3 weeks ago
- Community interface for generative AI ☆8,896 Updated 8 months ago
- 🎙️🤖Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation w… ☆6,062 Updated 6 months ago
- ☆7,720 Updated 9 months ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆3,699 Updated last week
- AudioLDM: Generate speech, sound effects, music and beyond, with text. ☆2,527 Updated last month
- Inference and training library for high-quality TTS models. ☆4,910 Updated last month
- Foundational model for human-like, expressive TTS ☆3,979 Updated 5 months ago
- An Open Source text-to-speech system built by inverting Whisper. ☆4,080 Updated last month
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆8,947 Updated this week
- Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch ☆1,457 Updated 2 months ago
- Text-to-Audio/Music Generation ☆2,355 Updated 3 months ago
- VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech ☆7,042 Updated last year