tencent-ailab / SongGeneration
☆546 Updated this week
Alternatives and similar repositories for SongGeneration
Users that are interested in SongGeneration are comparing it to the libraries listed below
- FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝 ☆606 Updated 11 months ago
- [ICML 2025] SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation ☆245 Updated last week
- TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching ☆756 Updated last month
- ☆222 Updated last week
- InspireMusic: A toolkit designed for music, song, and audio generation ☆1,130 Updated last month
- ☆433 Updated last month
- DICE-Talk is a diffusion-based emotional talking head generation method that can generate vivid and diverse emotions for speaking portrai… ☆220 Updated 2 months ago
- MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting… ☆353 Updated this week
- High-quality Text-to-Audio Generation with Efficient Diffusion Transformer ☆301 Updated last week
- An Open-Sourced LLM-empowered Foundation TTS System ☆744 Updated last month
- OpenMusic: SOTA Text-to-music (TTM) Generation ☆597 Updated 2 weeks ago
- ☆643 Updated last week
- ☆421 Updated 2 months ago
- KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution ☆333 Updated 3 weeks ago
- SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers ☆540 Updated last month
- Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation ☆1,207 Updated last week
- [ICCV2025] Official Pytorch Implementation of FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait. ☆285 Updated 2 weeks ago
- ☆270 Updated 3 months ago
- Awesome music generation model: MG² ☆158 Updated 3 months ago
- LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025) ☆236 Updated this week
- Added vLLM support to IndexTTS for faster inference. ☆287 Updated this week
- ☆287 Updated last year
- YuE with mp3 extend, exllama and GUI ☆55 Updated 4 months ago
- ☆79 Updated last month
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio. ☆166 Updated last year
- Fork of ACE-Step for LoRA training with < 10 GB VRAM ☆23 Updated 3 weeks ago
- ☆498 Updated 2 weeks ago
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆586 Updated 3 months ago
- [CVPR 2025] HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation ☆261 Updated last month
- Extension of ChatTTS, 3x Faster on Windows, Support Voice Cloning and Mobile Deployment ☆166 Updated 5 months ago