SparkAudio / Spark-TTS
Spark-TTS Inference Code
☆8,350Updated this week
Alternatives and similar repositories for Spark-TTS:
Users who are interested in Spark-TTS are comparing it to the repositories listed below.
- ☆4,150Updated last month
- Multilingual large voice generation model providing full-stack inference, training, and deployment capabilities.☆12,908Updated last week
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"☆11,184Updated last week
- Taming Stable Diffusion for Lip Sync!☆3,577Updated this week
- Towards Human-Sounding Speech☆3,986Updated this week
- A video translation and dubbing tool powered by LLMs, offering professional-grade translations and one-click full-process deployment. It…☆1,878Updated this week
- SOTA Open Source TTS☆20,658Updated this week
- GLM-4-Voice | End-to-end Chinese-English speech dialogue model☆2,826Updated 4 months ago
- Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team☆12,294Updated last week
- Multilingual Voice Understanding Model☆5,303Updated 3 weeks ago
- ☆2,781Updated 3 weeks ago
- Like Manus, Computer Use Agent (CUA) and Omniparser, we are computer-using agents. AI-driven local automation assistant that uses natural l…☆3,131Updated 2 weeks ago
- InspireMusic: A Unified Framework for Music, Song, Audio Generation.☆1,048Updated this week
- Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.☆4,406Updated last month
- Toolkit for linearizing PDFs for LLM datasets/training☆11,088Updated this week
- 🍒 Cherry Studio is a desktop client that supports multiple LLM providers.☆23,045Updated this week
- Zero-shot voice conversion & singing voice conversion, with real-time support☆2,187Updated 3 weeks ago
- 🎨 Refly is an open-source AI-native creation engine. Its intuitive free-form canvas interface combines multi-threaded dialogues, artifac…☆3,531Updated this week
- Officially recommended ChatTTS resource collection project, compiling related resources and frequently asked questions from across the web☆1,601Updated 9 months ago
- Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"☆2,360Updated last month
- Using large AI models to automatically provide commentary and edit videos with a single click.☆4,585Updated this week
- ☆4,478Updated this week
- Real-time interactive streaming digital human☆5,222Updated 2 weeks ago
- [AAAI 2025] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning☆3,771Updated 4 months ago
- A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations☆13,475Updated this week
- https://hf.co/hexgrad/Kokoro-82M☆2,268Updated this week
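- AigcPanel is a simple, easy-to-use, all-in-one AI digital human system that supports video synthesis, voice synthesis, and voice cloning, and simplifies local model management with one-click import and use of AI models.☆2,768Updated last week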
- Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion☆1,440Updated 2 weeks ago
- A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's CoT reasoning traces with Anthropic Claude models.☆5,038Updated 2 months ago
- A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.☆11,230Updated this week