coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆37,794 Updated 6 months ago
Alternatives and similar repositories for TTS:
Users who are interested in TTS are comparing it to the libraries listed below
- 🔊 Text-Prompted Generative Audio Model ☆36,988 Updated 6 months ago
- A multi-voice TTS system trained with an emphasis on quality ☆13,707 Updated 3 months ago
- Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) ☆9,636 Updated last year
- Port of OpenAI's Whisper model in C/C++ ☆37,876 Updated this week
- Robust Speech Recognition via Large-Scale Weak Supervision ☆76,600 Updated last month
- Faster Whisper transcription with CTranslate2 ☆14,234 Updated last month
- A Gradio web UI for Large Language Models with support for multiple inference backends. ☆42,540 Updated this week
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/ ☆7,788 Updated last year
- Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key ☆7,396 Updated 2 weeks ago
- EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine ☆7,666 Updated 6 months ago
- Easily train a good VC model with voice data <= 10 mins! ☆27,116 Updated 2 months ago
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆30,988 Updated last month
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆37,834 Updated this week
- The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. ☆67,685 Updated this week
- Clone a voice in 5 seconds to generate arbitrary speech in real-time ☆53,542 Updated 6 months ago
- [CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation ☆12,341 Updated 7 months ago
- Let us control diffusion models! ☆31,490 Updated 11 months ago
- Industry leading face manipulation platform ☆21,538 Updated this week
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆5,465 Updated 6 months ago
- Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and creat… ☆24,484 Updated this week
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆21,505 Updated last month
- Real-time face swap for PC streaming or video calls ☆27,639 Updated 3 months ago
- 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy. ☆2,346 Updated 11 months ago
- An Open Source text-to-speech system built by inverting Whisper. ☆4,120 Updated 2 months ago
- Foundational Models for State-of-the-Art Speech and Text Translation ☆11,339 Updated 3 months ago
- [SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild ☆6,881 Updated 6 months ago
- so-vits-svc fork with realtime support, improved interface and more features. ☆8,894 Updated this week
- Stable Diffusion web UI ☆148,011 Updated this week
- LLM inference in C/C++ ☆74,674 Updated this week
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) ☆13,990 Updated this week