Open-Source Frontier Voice AI
☆23,784 · Mar 6, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for VibeVoice
Users who are interested in VibeVoice are comparing it to the libraries listed below.
- SoTA open-source TTS ☆23,651 · Mar 9, 2026 · Updated last week
- SOTA Open Source TTS ☆27,364 · Mar 13, 2026 · Updated last week
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆19,202 · Nov 19, 2025 · Updated 4 months ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆14,223 · Updated this week
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆20,038 · Updated this week
- An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System ☆19,484 · Updated this week
- VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning ☆6,128 · Mar 13, 2026 · Updated last week
- Python tool for converting files and office documents to Markdown. ☆91,227 · Updated this week
- Towards Human-Sounding Speech ☆6,016 · Dec 5, 2025 · Updated 3 months ago
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆62,080 · Updated this week
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease. ☆81,169 · Updated this week
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆44,821 · Aug 16, 2024 · Updated last year
- Build, deploy, and orchestrate AI agents. Sim is the central intelligence layer for your AI workforce. ☆27,066 · Updated this week
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆36,136 · Apr 19, 2025 · Updated 11 months ago
- Wan: Open and Advanced Large-Scale Video Generative Models ☆14,751 · Updated this week
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆9,716 · May 27, 2025 · Updated 9 months ago
- 🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data ☆93,251 · Mar 15, 2026 · Updated last week
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. ☆54,096 · Updated this week
- A simple screen parsing tool towards pure vision based GUI agent ☆24,546 · Sep 12, 2025 · Updated 6 months ago
- Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models. ☆102,903 · Updated this week
- The conversational control layer for customer-facing AI agents - Parlant is a context-engineering framework optimized for controlling cus… ☆17,826 · Updated this week
- Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ in… ☆179,955 · Updated this week
- Simultaneous speech-to-text models ☆9,957 · Updated this week
- Spark-TTS Inference Code ☆10,960 · Apr 9, 2025 · Updated 11 months ago
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone ☆24,144 · Mar 7, 2026 · Updated 2 weeks ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆9,832 · Mar 4, 2026 · Updated 2 weeks ago
- Robust Speech Recognition via Large-Scale Weak Supervision ☆96,288 · Dec 15, 2025 · Updated 3 months ago
- User-friendly AI Interface (Supports Ollama, OpenAI API, ...) ☆127,399 · Updated this week
- 🔊 Text-Prompted Generative Audio Model ☆39,045 · Aug 19, 2024 · Updated last year
- A Conversational Speech Generation Model ☆14,545 · May 27, 2025 · Updated 9 months ago
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation ☆4,520 · Jun 21, 2025 · Updated 9 months ago
- real time face swap and one-click video deepfake with only a single image ☆80,121 · Mar 13, 2026 · Updated last week
- Official inference framework for 1-bit LLMs ☆35,906 · Mar 10, 2026 · Updated last week
- The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. ☆106,179 · Updated this week
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec… ☆6,334 · Updated this week
- Production-ready platform for agentic workflow development. ☆133,787 · Updated this week
- Lightweight coding agent that runs in your terminal ☆65,974 · Updated this week
- 🙌 OpenHands: AI-Driven Development ☆69,254 · Updated this week
- Universal memory layer for AI Agents ☆50,147 · Updated this week