Stability-AI/stable-audio-3

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Stability-AI/stable-audio-3)

Stability-AI / stable-audio-3

☆625

Alternatives and similar repositories for stable-audio-3

Users that are interested in stable-audio-3 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

OpenMOSS / MOSS-Music
View on GitHub
MOSS-Music is an open-source music understanding model for targeting musical captioning, lyrics ASR, structural analysis, chord / key / t…
☆122May 9, 2026Updated 2 months ago
dada-bots / underfit
View on GitHub
Stable Audio LoRA Trainer of salty goodness
☆79Updated this week
ZacharyNovack / live-music-diffusion-models
View on GitHub
☆48May 22, 2026Updated 2 months ago
cwx-worst-one / WavTTS
View on GitHub
WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling
☆210Jun 6, 2026Updated last month
yanghaha0908 / WavCube
View on GitHub
Official code for "WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling"
☆62Jun 27, 2026Updated 3 weeks ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
xiquan-li / MeanAudio
View on GitHub
[ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
☆142Sep 2, 2025Updated 10 months ago
wx9Songs / MOSS-Music-Data-Pipeline
View on GitHub
☆44Apr 26, 2026Updated 3 months ago
SonyCSLParis / codicodec
View on GitHub
Encode and decode audio samples to/from continuous and discrete compressed representations!
☆121Nov 25, 2025Updated 8 months ago
Soul-AILab / SAC
View on GitHub
[ACL 2026 Main] Training, inference, and testing of the SAC speech codec model.
☆108Nov 1, 2025Updated 8 months ago
ZhikangNiu / Semantic-VAE
View on GitHub
[INTERSPEECH 2026 Oral]Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆121Jun 21, 2026Updated last month
fundwotsai2001 / Text-to-Music_control_family
View on GitHub
Containing SOTA methods that follows time-varying conditions for Text-to-Music
☆24Jan 1, 2026Updated 6 months ago
xiaomi-research / dasheng-audiogen
View on GitHub
end-to-end text to audio scene generation model
☆50Jun 16, 2026Updated last month
Ereboas / MagiCodec
View on GitHub
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆125Jun 4, 2025Updated last year
facebookresearch / dacvae
View on GitHub
DACVAE
☆226Dec 22, 2025Updated 7 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
juhayna-zh / AudioControlNet
View on GitHub
Official repository for the paper "Audio ControlNet for Fine-Grained Audio Generation and Editing".
☆77Feb 7, 2026Updated 5 months ago
facebookresearch / audiobox-aesthetics
View on GitHub
Unified automatic quality assessment for speech, music, and sound.
☆745Jun 5, 2025Updated last year
tencent-ailab / MuQ
View on GitHub
Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization".
☆360Aug 4, 2025Updated 11 months ago
facebookresearch / FlowDec
View on GitHub
An neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kps or 4.5 kbps.
☆212Jun 22, 2026Updated last month
bovod-sjtu / HoliTok
View on GitHub
HoliTok:A Coutinuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding
☆39Jun 8, 2026Updated last month
SonyResearch / Woosh
View on GitHub
Public release of the Sound Effect Foundation model by Sony AI.
☆354May 21, 2026Updated 2 months ago
xiquan-li / Resonate
View on GitHub
[INTERSPEECH 2026] Pre-training, SFT, DPO and GRPO for Text-to-Audio Generation
☆48Apr 17, 2026Updated 3 months ago
kandinskylab / kvae-audio
View on GitHub
KVAE-Audio: a continuous full-band audio waveform autoencoder
☆101Updated this week
inclusionAI / Ming-omni-tts
View on GitHub
Ming-omni-tts: Simple and Efficient Unified Generation of Speech, Music, and Sound with Precise Control
☆263Feb 26, 2026Updated 5 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Eps-Acoustic-Revolution-Lab / EAR_VAE
View on GitHub
[INTERSPEECH 2026] This is the official implementation for εar-VAE model including inference and evaluation parts, more details coming so…
☆88Feb 13, 2026Updated 5 months ago
moiseshorta / CoDiCodec-Flow
View on GitHub
Realtime audio generation model using Flow Matching DiT on CoDiCodec latents.
☆41May 30, 2026Updated last month
SWivid / AUV
View on GitHub
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28Oct 11, 2025Updated 9 months ago
NVIDIA / audio-intelligence
View on GitHub
Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with syntheti…
☆137Mar 3, 2026Updated 4 months ago
Ruiqi-Yan / Awesome-Audio-Editing
View on GitHub
A curated list of models, benchmarks, tools and guides for audio editing
☆34Jul 7, 2026Updated 2 weeks ago
XiaomiMiMo / MiMo-Audio-Tokenizer
View on GitHub
A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.
☆145Sep 19, 2025Updated 10 months ago
k2-fsa / Flow2GAN
View on GitHub
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
☆145Mar 8, 2026Updated 4 months ago
NVIDIA / audio-flamingo
View on GitHub
PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models
☆1,161Dec 15, 2025Updated 7 months ago
xiaomi-research / dasheng-tokenizer
View on GitHub
State-of-the-art continious audio tokenization
☆40Mar 9, 2026Updated 4 months ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
zhaoyx239 / X-Translator
View on GitHub
☆25Updated this week
gyt1145028706 / XY-Tokenizer
View on GitHub
This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs
☆97Sep 19, 2025Updated 10 months ago
fundwotsai2001 / MuseControlLite
View on GitHub
MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners [ICML 2025]
☆68Jan 6, 2026Updated 6 months ago
minzwon / musicfm
View on GitHub
☆268Feb 14, 2024Updated 2 years ago
ASLP-lab / FlashTTS
View on GitHub
Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation
☆67Jun 16, 2026Updated last month
facebookresearch / WavFlow
View on GitHub
MultiModal Audio Generation in Raw Waveform Space.
☆154May 26, 2026Updated 2 months ago
HeCheng0625 / Diffusion-Speech-Tokenizer
View on GitHub
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198Jan 25, 2026Updated 6 months ago