RetroCirce/MusicLDM

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/RetroCirce/MusicLDM)

RetroCirce / MusicLDM

The latent diffusion model for text-to-music generation.

☆187

Alternatives and similar repositories for MusicLDM

Users that are interested in MusicLDM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

AMAAI-Lab / mustango
View on GitHub
Mustango: Toward Controllable Text-to-Music Generation
☆394Jun 2, 2025Updated last year
seungheondoh / lp-music-caps
View on GitHub
LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]
☆348Apr 8, 2024Updated 2 years ago
glory20h / VoiceLDM
View on GitHub
VoiceLDM: Text-to-Speech with Environmental Context
☆194Aug 9, 2024Updated last year
haoheliu / AudioLDM-training-finetuning
View on GitHub
AudioLDM training, finetuning, evaluation and inference.
☆304Dec 13, 2024Updated last year
zengchang233 / xiaoicesing2
View on GitHub
The source code for the paper XiaoiceSing2 (interspeech2023)
☆49Jan 15, 2024Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
yluo42 / SRVQ
View on GitHub
Spherical residual vector quantization (SRVQ)
☆31Aug 25, 2024Updated last year
ilaria-manco / muscaps
View on GitHub
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)
☆85Dec 3, 2024Updated last year
Audio-AGI / WavJourney
View on GitHub
WavJourney: Compositional Audio Creation with LLMs
☆544Sep 28, 2023Updated 2 years ago
yizhilll / MERT
View on GitHub
Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".
☆481May 25, 2025Updated last year
mulab-mir / song-describer-dataset
View on GitHub
The Song Describer dataset is an evaluation dataset made of ~1.1k captions for 706 permissively licensed music recordings.
☆175Dec 22, 2023Updated 2 years ago
haoheliu / audioldm_eval
View on GitHub
This toolbox aims to unify audio generation model evaluation for easier comparison.
☆390Sep 29, 2024Updated last year
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
yangdongchao / UniAudio
View on GitHub
The Open Source Code of UniAudio
☆605Jul 22, 2024Updated 2 years ago
shansongliu / MU-LLaMA
View on GitHub
MU-LLaMA: Music Understanding Large Language Model
☆306Aug 18, 2025Updated 11 months ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
yoyolicoris / music-spectrogram-diffusion-pytorch
View on GitHub
☆88Jan 29, 2023Updated 3 years ago
yzGuu830 / efficient-speech-codec
View on GitHub
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126Mar 20, 2025Updated last year
habla-liaa / encodecmae
View on GitHub
Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'
☆101Jul 24, 2024Updated last year
0417keito / JEN-1-pytorch
View on GitHub
Unofficial implementation JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models(https://arxiv.org/abs/2308.…
☆55Jan 18, 2024Updated 2 years ago
minzwon / musicfm
View on GitHub
☆268Feb 14, 2024Updated 2 years ago
bfs18 / rfwave
View on GitHub
☆151Apr 25, 2025Updated last year
reppy4620 / vocoders
View on GitHub
My vocoder experiments
☆31Jul 26, 2025Updated 11 months ago
spotify-research / llark
View on GitHub
Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, an…
☆384May 30, 2024Updated 2 years ago
XinhaoMei / ACT
View on GitHub
Source code for the paper 'Audio Captioning Transformer'
☆56Jan 18, 2022Updated 4 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
affige / genmusic_demo_list
View on GitHub
a list of demo websites for automatic music generation research
☆791Jul 4, 2026Updated 2 weeks ago
hugofloresgarcia / vampnet
View on GitHub
music generation with masked transformers!
☆357May 16, 2025Updated last year
haoheliu / AudioLDM2
View on GitHub
Text-to-Audio/Music Generation
☆2,635Sep 29, 2024Updated last year
seungheondoh / music-text-representation
View on GitHub
Toward Universal Text-to-Music-Retrieval (TTMR) [ICASSP23]
☆113Aug 12, 2023Updated 2 years ago
salu133445 / mmt
View on GitHub
Official Implementation of "Multitrack Music Transformer" (ICASSP 2023)
☆155Mar 14, 2024Updated 2 years ago
walker-hyf / GPT-Talker
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆78Nov 1, 2024Updated last year
cwitkowitz / ss-mpe
View on GitHub
Code for the paper "Toward Fully Self-Supervised Multi-Pitch Estimation".
☆25Sep 27, 2025Updated 9 months ago
KyungsuKim42 / tokensynth
View on GitHub
The official implementation of TokenSynth (ICASSP 2025)
☆91Jun 24, 2026Updated 3 weeks ago
MuyangDu / T5Voice
View on GitHub
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28Nov 7, 2025Updated 8 months ago
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
zhvng / open-musiclm
View on GitHub
Implementation of MusicLM, a text to music model published by Google Research, with a few modifications.
☆560Jun 3, 2023Updated 3 years ago
sony / bigvsan
View on GitHub
Pytorch implementation of BigVSAN
☆203Dec 9, 2025Updated 7 months ago
archinetai / audio-diffusion-pytorch
View on GitHub
Audio generation using diffusion models, in PyTorch.
☆2,097Jun 12, 2023Updated 3 years ago
tencent-ailab / MuCodec
View on GitHub
☆169Nov 22, 2024Updated last year
microsoft / fadtk
View on GitHub
A simple library for Fréchet Audio Distance (FAD) calculation
☆266Aug 22, 2025Updated 11 months ago
winddori2002 / TriAAN-VC
View on GitHub
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
☆146Jan 15, 2024Updated 2 years ago
YatingMusic / MusiConGen
View on GitHub
☆88Oct 20, 2024Updated last year