ivcylc / OpenMusic
OpenMusic: SOTA Text-to-music (TTM) Generation
☆515Updated 3 weeks ago
Alternatives and similar repositories for OpenMusic:
Users who are interested in OpenMusic are comparing it to the libraries listed below.
- InspireMusic: A Unified Framework for Music, Song, Audio Generation.☆339Updated this week
- High-quality Text-to-Audio Generation with Efficient Diffusion Transformer☆254Updated 2 months ago
- Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks☆874Updated this week
- Awesome music generation model: MG²☆131Updated last week
- FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝☆512Updated 6 months ago
- TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching☆595Updated this week
- Identity-Preserving Text-to-Video Generation by Frequency Decomposition☆559Updated this week
- Official comfyui repository of Hellomeme☆320Updated 2 weeks ago
- MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code☆528Updated 3 months ago
- PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model☆629Updated 8 months ago
- This is the official reproduction of FancyVideo.☆670Updated 2 months ago
- The official HelloMeme GitHub site☆549Updated 2 weeks ago
- Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple te…☆1,026Updated 3 weeks ago
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".☆867Updated 3 months ago
- Interface for OuteTTS models.☆899Updated last week
- Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.☆1,579Updated last week
- SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling☆929Updated 3 weeks ago
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3☆381Updated 4 months ago
- This is the official repository for M2UGen☆466Updated 3 weeks ago
- Memory-Guided Diffusion for Expressive Talking Video Generation☆688Updated this week
- We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without an…☆1,116Updated 3 weeks ago
- Generative models for conditional audio generation☆137Updated last month
- Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models☆760Updated this week
- Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing…☆225Updated last month
- [arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis☆1,024Updated this week
- Refactored / updated version of `stable-audio-tools` which is an open-source code for audio/music generative models originally by Stabili…☆159Updated 6 months ago
- [ICLR2025] DisPose: Disentangling Pose Guidance for Controllable Human Image Animation☆310Updated last week
- An Open-Sourced LLM-empowered Foundation TTS System☆534Updated 3 months ago
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio.☆140Updated 8 months ago
- Implementation of "DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation"☆572Updated last month