myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
☆30,988 · Updated last month
Alternatives and similar repositories for OpenVoice:
Users who are interested in OpenVoice are comparing it to the libraries listed below.
- SOTA Open Source TTS ☆19,207 · Updated this week
- WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) ☆13,990 · Updated this week
- High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean. ☆5,578 · Updated last month
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆8,508 · Updated 2 weeks ago
- Faster Whisper transcription with CTranslate2 ☆14,234 · Updated last month
- Robust Speech Recognition via Large-Scale Weak Supervision ☆76,600 · Updated last month
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,121 · Updated 7 months ago
- Jan is an open source alternative to ChatGPT that runs 100% offline on your computer ☆27,625 · Updated this week
- 🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / Qwen /… ☆55,651 · Updated this week
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆37,794 · Updated 6 months ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆9,735 · Updated this week
- Foundational Models for State-of-the-Art Speech and Text Translation ☆11,339 · Updated 3 months ago
- Port of OpenAI's Whisper model in C/C++ ☆37,876 · Updated this week
- Inference and training library for high-quality TTS models. ☆5,025 · Updated 2 months ago
- 🔊 Text-Prompted Generative Audio Model ☆36,988 · Updated 6 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages ☆16,314 · Updated this week
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆10,766 · Updated this week
- 🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API. ☆26,234 · Updated this week
- Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥 ☆30,529 · Updated this week
- A generative speech model for daily dialogue. ☆34,495 · Updated this week
- Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key ☆7,396 · Updated 2 weeks ago
- EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine ☆7,666 · Updated 6 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆38,475 · Updated this week
- Foundational model for human-like, expressive TTS ☆4,035 · Updated 6 months ago
- MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone ☆18,531 · Updated this week
- the AI-native open-source embedding database ☆17,839 · Updated this week
- We write your reusable computer vision tools. 💜 ☆24,901 · Updated this week
- Convert PDF to markdown + JSON quickly with high accuracy ☆20,826 · Updated this week
- Agno is a lightweight library for building multi-modal Agents ☆19,111 · Updated this week
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ☆17,763 · Updated this week