ivcylc / qa-mdt
OpenMusic: SOTA Text-to-music (TTM) Generation
☆479Updated last week
Related projects
Alternatives and complementary repositories for qa-mdt
- High-quality Text-to-Audio Generation with Efficient Diffusion Transformer☆238Updated last week
- The official HelloMeme GitHub site☆171Updated last week
- Interface for OuteTTS models.☆406Updated 2 weeks ago
- PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model☆752Updated 5 months ago
- FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. AI Foley master: add vivid, synchronized sound effects to your silent videos 😝☆466Updated 3 months ago
- Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.☆1,565Updated 2 weeks ago
- Awesome music generation model: MG²☆112Updated this week
- Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models☆647Updated 2 months ago
- MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code☆424Updated last month
- Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing…☆225Updated 3 weeks ago
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3☆365Updated 2 months ago
- Fine-tune Stable Audio Open with DiT ControlNet.☆177Updated 2 months ago
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".☆781Updated 3 weeks ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch☆355Updated 2 weeks ago
- Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥☆307Updated this week
- Ultimate Vocal Remover 5 with Gradio UI. Separate an audio file into various stems, using multiple models☆214Updated last week
- Open source inference code for Rev's model☆333Updated this week
- Nendo is an open source platform for AI-driven audio management, intelligence, and generation.☆117Updated 8 months ago
- Generative models for conditional audio generation☆117Updated this week
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability.☆222Updated 2 months ago
- ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec☆196Updated 2 months ago
- AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model☆54Updated 2 weeks ago
- Mustango: Toward Controllable Text-to-Music Generation☆342Updated 3 months ago
- Official Pytorch implementation of StreamV2V.☆450Updated 2 months ago
- ☆139Updated last month
- ☆253Updated 8 months ago
- Customized ID Consistent for human☆845Updated 3 months ago
- [IJCAI'24] Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer☆185Updated 2 months ago
- Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati…☆156Updated 7 months ago