OpenBMB / VoxCPM
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
☆2,651Updated this week
Alternatives and similar repositories for VoxCPM
Users who are interested in VoxCPM compare it to the libraries listed below.
- Unlimited-length talking video generation that supports image-to-video and video-to-video generation☆3,666Updated 3 months ago
- MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting…☆1,047Updated 2 weeks ago
- Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation…☆1,245Updated 2 months ago
- [NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation☆2,695Updated 2 months ago
- A fundamental toolkit designed for music, song, and audio generation☆1,248Updated 6 months ago
- A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics…☆744Updated last week
- SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.☆2,637Updated 2 weeks ago
- VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)☆778Updated last week
- Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion☆2,121Updated 2 weeks ago
- MiMo-Audio: Audio Language Models are Few-Shot Learners☆871Updated 2 months ago
- Added vLLM support to IndexTTS for faster inference.☆926Updated last month
- The official code repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment☆984Updated this week
- Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support…☆713Updated last month
- An Open-Sourced LLM-empowered Foundation TTS System☆885Updated 2 months ago
- ☆1,948Updated last month
- ☆472Updated 6 months ago
- ☆642Updated last month
- zero-shot voice conversion & singing voice conversion, with real-time support☆3,451Updated 7 months ago
- Provides high-quality Chinese speech synthesis and voice cloning services based on models such as SparkTTS and OrpheusTTS.☆562Updated 6 months ago
- ☆476Updated 7 months ago
- Lightning-Fast, On-Device TTS — running natively via ONNX.☆1,756Updated this week
- ☆4,569Updated 6 months ago
- ACE-Step: A Step Towards Music Generation Foundation Model☆3,392Updated 5 months ago
- Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching☆729Updated last week
- Voice Activity Detector (VAD) : low-latency, high-performance and lightweight☆1,710Updated last week
- [NeurIPS 2025] OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication☆400Updated 2 months ago
- Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.☆667Updated 2 weeks ago
- HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation☆1,196Updated last month
- ☆530Updated 2 months ago
- The showcase page of IndexTTS2☆173Updated 2 months ago