microsoft / VibeVoice
Frontier Open-Source Text-to-Speech
☆9,961 Updated 2 months ago
Alternatives and similar repositories for VibeVoice
Users who are interested in VibeVoice are comparing it to the libraries listed below.
- VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning ☆2,067 Updated last month
- Wan: Open and Advanced Large-Scale Video Generative Models ☆11,649 Updated last month
- Simultaneous speech-to-text model ☆8,419 Updated last week
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation ☆4,347 Updated 4 months ago
- On-device TTS model by Neuphonic ☆3,965 Updated 2 weeks ago
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,588 Updated last month
- https://hf.co/hexgrad/Kokoro-82M ☆4,792 Updated 3 months ago
- State-of-the-art TTS model under 25MB 😻 ☆9,084 Updated 2 months ago
- SoTA open-source TTS ☆14,589 Updated last month
- Unlimited-length talking video generation that supports image-to-video and video-to-video generation ☆3,109 Updated 2 months ago
- Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containeriz… ☆9,509 Updated 2 months ago
- Towards Human-Sounding Speech ☆5,709 Updated 6 months ago
- SkyReels-V2: Infinite-length Film Generative model ☆4,966 Updated 3 months ago
- ☆6,019 Updated 2 months ago
- A simple yet powerful agent framework that delivers with open-source models ☆3,831 Updated this week
- Multilingual Document Layout Parsing in a Single Vision-Language Model ☆5,684 Updated 2 weeks ago
- Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, im… ☆2,858 Updated last month
- [NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation ☆2,661 Updated last month
- An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System ☆15,082 Updated last week
- TTS with kokoro and onnx runtime ☆2,255 Updated 4 months ago
- SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text. ☆1,994 Updated this week
- Omnilingual ASR Open-Source Multilingual Speech Recognition for 1600+ Languages ☆1,058 Updated last week
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec… ☆2,901 Updated last week
- Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expres… ☆7,095 Updated 8 months ago
- ACE-Step: A Step Towards Music Generation Foundation Model ☆3,280 Updated 4 months ago
- The world's first open-source multimodal creative assistant. This is a substitute for Canva and Manus that prioritizes privacy and is usa… ☆5,141 Updated last week
- Prompt Orchestration Markup Language ☆4,727 Updated this week
- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models ☆3,181 Updated last month
- ⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now,… ☆3,376 Updated 3 weeks ago
- LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm. ☆7,536 Updated this week