tencent-ailab / SongGeneration
☆113Updated this week
Alternatives and similar repositories for SongGeneration
Users who are interested in SongGeneration are comparing it to the repositories listed below.
- ☆78Updated 8 months ago
- ☆75Updated last week
- ☆60Updated this week
- Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis.☆46Updated 9 months ago
- [ICML 2025] SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation☆232Updated 3 months ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers☆99Updated last month
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio.☆165Updated last year
- Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).☆110Updated 4 months ago
- Fork of ACE-Step for LoRA training with < 10 GB VRAM☆18Updated last week
- Pytorch implementation of SoundCTM☆96Updated 2 months ago
- TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching☆60Updated last month
- CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages [ACL 2025]☆160Updated last month
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".☆86Updated last year
- YuE with mp3 extend, exllama and GUI☆53Updated 3 months ago
- ☆95Updated 6 months ago
- Anim-400K: A dataset designed from the ground up for automated dubbing of video☆107Updated 11 months ago
- Awesome music generation model: MG²☆157Updated 2 months ago
- Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati…☆184Updated last year
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability☆101Updated 5 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆93Updated 5 months ago
- [AAAI 2024] V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models☆25Updated last year
- ☆174Updated 5 months ago
- [AAAI 2025] VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization☆49Updated 6 months ago
- Music production for silent film clips.☆25Updated last month
- ☆40Updated 4 months ago
- Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models☆188Updated last year
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,…☆74Updated 8 months ago
- Flexible LoRA Implementation to use with stable-audio-tools☆72Updated 9 months ago
- official code for CVPR'24 paper Diff-BGM☆64Updated 8 months ago
- ☆75Updated last year