ElectricAlexis / NotaGen
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
☆940 Updated 2 weeks ago
Alternatives and similar repositories for NotaGen:
Users interested in NotaGen are comparing it to the repositories listed below.
- ☆540 Updated last week
- InspireMusic: A Unified Framework for Music, Song, Audio Generation. ☆1,060 Updated last week
- TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching ☆708 Updated last month
- Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion ☆1,456 Updated 3 weeks ago
- YuE: Open Full-song Generation Foundation for the GPU Poor ☆375 Updated 2 months ago
- [CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis ☆1,315 Updated last week
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆994 Updated last week
- Interface for OuteTTS models. ☆1,178 Updated last week
- OpenMusic: SOTA Text-to-music (TTM) Generation ☆552 Updated 2 months ago
- YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open ☆4,820 Updated 2 weeks ago
- ☆1,392 Updated 3 weeks ago
- CVPR2025 ☆842 Updated last month
- HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo ☆1,325 Updated this week
- first base model for full-duplex conversational audio ☆1,731 Updated 3 months ago
- ☆727 Updated 2 months ago
- A text-to-speech (TTS) and Speech-to-Speech (STS) library built on Apple's MLX framework, providing efficient speech synthesis on Apple S… ☆478 Updated last week
- Thera: Aliasing-Free Arbitrary-Scale Super-Resolution with Neural Heat Fields ☆754 Updated 2 weeks ago
- Run Orpheus 3B Locally With LM Studio ☆367 Updated last month
- High-quality Text-to-Audio Generation with Efficient Diffusion Transformer ☆263 Updated last week
- A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats includ… ☆371 Updated last week
- Motion-Controllable Video Diffusion via Warped Noise ☆864 Updated 3 weeks ago
- SynCity: Training-Free Generation of 3D Worlds ☆566 Updated last week
- A Fast TTS Engine ☆490 Updated 3 months ago
- FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝 ☆570 Updated 8 months ago
- OmniSVG is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of g… ☆1,419 Updated last week
- The official ElevenLabs MCP server ☆565 Updated last week
- Official implementations for paper: VACE: All-in-One Video Creation and Editing ☆1,338 Updated 2 weeks ago
- Official implementation of SVFR. ☆792 Updated 3 months ago
- Implementation of F5-TTS in MLX ☆520 Updated last month
- ☆658 Updated this week