microsoft / VibeVoice
Open-Source Frontier Voice AI
☆22,955Updated this week
Alternatives and similar repositories for VibeVoice
Users who are interested in VibeVoice are comparing it to the libraries listed below.
- VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning☆5,715Updated 2 weeks ago
- Simultaneous speech-to-text model☆9,644Updated 3 weeks ago
- SoTA open-source TTS☆22,346Updated this week
- Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streamin…☆6,994Updated this week
- Send a phone call from AI agent, in an API call. Or, directly call the bot from the configured phone number!☆6,167Updated 3 months ago
- On-device TTS model by Neuphonic☆4,768Updated last week
- The world's first open-source multimodal creative assistant. This is a substitute for Canva and Manus that prioritizes privacy and is usa…☆5,849Updated 3 months ago
- State-of-the-art TTS model under 25MB 😻☆9,590Updated last week
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec…☆5,842Updated this week
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation☆4,477Updated 7 months ago
- An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System☆18,601Updated 2 months ago
- Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.☆2,552Updated 2 weeks ago
- ☆11,124Updated this week
- Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containeriz…☆10,379Updated 4 months ago
- The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trai…☆3,256Updated last month
- LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.☆12,913Updated this week
- Unlimited-length talking video generation that supports image-to-video and video-to-video generation☆4,750Updated last month
- Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.☆7,221Updated last month
- A research prototype of a human-centered web agent☆9,632Updated 2 weeks ago
- ☆9,830Updated last week
- SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.☆3,142Updated last month
- An Open Source implementation of Notebook LM with more flexibility and features☆19,137Updated last week
- SkyReels-V2: Infinite-length Film Generative model☆6,212Updated last week
- Wan: Open and Advanced Large-Scale Video Generative Models☆14,122Updated last month
- A simple yet powerful agent framework that delivers with open-source models☆4,375Updated this week
- Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages☆2,632Updated last month
- Towards Human-Sounding Speech☆5,935Updated 2 months ago
- Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost …☆24,939Updated 2 months ago
- Contexts Optical Compression☆22,430Updated 2 weeks ago
- DeepTutor: AI-Powered Personalized Learning Assistant☆10,145Updated this week