microsoft / VibeVoice
Open-Source Frontier Voice AI
☆20,325 Updated last month
Alternatives and similar repositories for VibeVoice
Users who are interested in VibeVoice are comparing it to the libraries listed below.
- Simultaneous speech-to-text model ☆9,468 Updated last week
- The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trai… ☆3,001 Updated last week
- SoTA open-source TTS ☆21,539 Updated last month
- An Open Source implementation of Notebook LM with more flexibility and features ☆17,841 Updated last week
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation ☆4,444 Updated 6 months ago
- LLM agents built for control. Designed for real-world use. Deployed in minutes. ☆17,528 Updated this week
- Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containeriz… ☆10,191 Updated 4 months ago
- VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning ☆3,369 Updated this week
- An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System ☆17,740 Updated last month
- On-device TTS model by Neuphonic ☆4,389 Updated this week
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec… ☆3,223 Updated last week
- ☆6,058 Updated 4 months ago
- LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm. ☆11,912 Updated this week
- The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025. ☆8,651 Updated last month
- Unlimited-length talking video generation that supports image-to-video and video-to-video generation ☆4,452 Updated last month
- Chrome DevTools for coding agents ☆20,688 Updated this week
- Fara-7B: An Efficient Agentic Model for Computer Use ☆3,458 Updated last month
- https://hf.co/hexgrad/Kokoro-82M ☆5,336 Updated 5 months ago
- ☆2,675 Updated 2 months ago
- 💖🧸 Self hosted, you owned Grok Companion, a container of souls of waifu, cyber livings to bring them into our worlds, wishing to achiev… ☆16,852 Updated this week
- Text-audio foundation model from Boson AI ☆7,842 Updated 4 months ago
- A free, open source, and extensible speech-to-text application that works completely offline. ☆10,695 Updated this week
- State-of-the-art TTS model under 25MB 😻 ☆9,454 Updated 4 months ago
- Wan: Open and Advanced Large-Scale Video Generative Models ☆13,549 Updated last month
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,742 Updated last month
- OmniGen2: Exploration to Advanced Multimodal Generation. https://arxiv.org/abs/2506.18871 ☆3,986 Updated last month
- Local, open-source AI app builder for power users ✨ v0 / Lovable / Replit / Bolt alternative 🌟 Star if you like it! ☆19,211 Updated this week
- The world's first open-source multimodal creative assistant. This is a substitute for Canva and Manus that prioritizes privacy and is usa… ☆5,748 Updated 2 months ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆13,958 Updated this week
- RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal… ☆8,624 Updated this week