microsoft / VibeVoice
Open-Source Frontier Voice AI
☆11,186Updated this week
Alternatives and similar repositories for VibeVoice
Users who are interested in VibeVoice are comparing it to the libraries listed below.
- VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning☆2,221Updated last month
- https://hf.co/hexgrad/Kokoro-82M☆4,996Updated 4 months ago
- Unlimited-length talking video generation that supports image-to-video and video-to-video generation☆3,666Updated 3 months ago
- Wan: Open and Advanced Large-Scale Video Generative Models☆12,282Updated 3 weeks ago
- Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, im…☆3,018Updated last month
- A research prototype of a human-centered web agent☆8,367Updated last week
- Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containeriz…☆9,842Updated 2 months ago
- Simultaneous speech-to-text model☆8,942Updated last week
- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models☆3,265Updated this week
- Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.☆6,290Updated 3 weeks ago
- The world's first open-source multimodal creative assistant. This is a substitute for Canva and Manus that prioritizes privacy and is usa…☆5,291Updated 3 weeks ago
- State-of-the-art TTS model under 25MB 😻☆9,218Updated 3 months ago
- SoTA open-source TTS☆14,907Updated 2 months ago
- On-device TTS model by Neuphonic☆4,167Updated 2 weeks ago
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation☆4,368Updated 5 months ago
- OmniGen2: Exploration to Advanced Multimodal Generation.☆3,951Updated this week
- SkyReels-V2: Infinite-length Film Generative model☆5,081Updated 3 months ago
- ☆6,040Updated 3 months ago
- A simple yet powerful agent framework that delivers with open-source models☆3,887Updated last week
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.☆2,636Updated last week
- [NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation☆2,695Updated 2 months ago
- Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion☆2,104Updated last week
- Omnilingual ASR Open-Source Multilingual Speech Recognition for 1600+ Languages☆2,358Updated 2 weeks ago
- Prompt Orchestration Markup Language☆4,765Updated this week
- Generate audiobooks from EPUBs, PDFs and text with synchronized captions.☆3,924Updated this week
- MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.☆3,000Updated 5 months ago
- ACE-Step: A Step Towards Music Generation Foundation Model☆3,392Updated 5 months ago
- Build, enrich, and transform datasets using AI models with no code☆1,586Updated last month
- Qwen Code is a coding agent that lives in the digital world.☆16,028Updated this week
- Lightning-fast, on-device TTS — running natively via ONNX.☆1,579Updated last week