edwko / OuteTTS
Interface for OuteTTS models.
☆317 · Updated this week
Related projects
Alternatives and complementary repositories for OuteTTS
- Open source inference code for Rev's model ☆331 · Updated 2 weeks ago
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model". ☆755 · Updated 2 weeks ago
- Implementation of F5-TTS in MLX ☆311 · Updated last week
- Have a natural voice conversation with an LLM ☆222 · Updated this week
- Local SRT/LLM/TTS Voicechat ☆535 · Updated last month
- Whisper with Medusa heads ☆800 · Updated last week
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆219 · Updated 2 months ago
- High-quality Text-to-Audio Generation with Efficient Diffusion Transformer ☆234 · Updated 3 weeks ago
- Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯 ☆699 · Updated this week
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆81 · Updated last month
- first base model for full-duplex conversational audio ☆1,362 · Updated this week
- Joint speech-language model - respond directly to audio! ☆355 · Updated 4 months ago
- Collection of Open Source Speech Data ☆144 · Updated this week
- ☆251 · Updated 7 months ago
- ☆443 · Updated this week
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch ☆340 · Updated last week
- ☆171 · Updated 11 months ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆252 · Updated 2 months ago
- A lightweight end-to-end text-to-speech model ☆90 · Updated last month
- podcastfy.ai gradio demo app ☆309 · Updated 2 weeks ago
- ⚡ Insanely fast AI voice assistant with <500ms response times ☆302 · Updated 2 months ago
- Ultimate Vocal Remover 5 with Gradio UI. Separate an audio file into various stems, using multiple models ☆203 · Updated this week
- Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon ☆237 · Updated 2 months ago
- ☆303 · Updated 2 months ago
- 📋 NotebookMLX - An Open Source version of NotebookLM (Ported NotebookLlama) ☆161 · Updated last week
- ☆170 · Updated 2 months ago
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆135 · Updated 3 months ago
- Building Blocks for Multi-Modal Gradio Powered by Groq Apps ☆59 · Updated last week
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal. ☆200 · Updated 3 weeks ago
- ☆295 · Updated 4 months ago