multimodal-art-projection/Open-Suno

Trying to reproduce Suno v3

☆35

Alternatives and similar repositories for Open-Suno

Users that are interested in Open-Suno are comparing it to the libraries listed below.

Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
Mddct / usm-tokenizer
semantic tokenizer for speech and music
☆20 · Jul 6, 2025 · Updated last year
ex3ndr / supervoice-librilight-preprocessed
60k hours of phoneme-aligned audio from audio books
☆19 · Jul 27, 2024 · Updated 2 years ago
AbrahamSanders / codec-bpe
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
☆76 · Dec 3, 2025 · Updated 7 months ago
KdaiP / DC-Speech-VAE
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57 · Nov 19, 2025 · Updated 8 months ago
yangdongchao / LLM-Codec
The open source code for LLM-Codec
☆147 · Aug 18, 2024 · Updated last year
lmxue / Audio-FLAN
Audio-FLAN
☆161 · Sep 23, 2025 · Updated 10 months ago
dysomni / Harmonizer
JUCE audio plugin for realtime pitch shifting and voice duplication from MIDI keyboard input. Works differently than a vocoder as it can …
☆12 · Sep 14, 2021 · Updated 4 years ago
Eps-Acoustic-Revolution-Lab / EAR_HEAR
☆15 · Jan 9, 2026 · Updated 6 months ago
xiquan-li / Awesome-Audio-Generation
Curated list of papers, code, and resources related to Text-to-Audio (TTA) Generation
☆75 · Jul 20, 2026 · Updated last week
zhenye234 / xcodec
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
☆308 · Oct 12, 2025 · Updated 9 months ago
Ereboas / MagiCodec
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆125 · Jun 4, 2025 · Updated last year
alibaba / vstyle
☆34 · Sep 15, 2025 · Updated 10 months ago
mubtasimahasan / DM-Codec
Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”
☆57 · Jun 1, 2025 · Updated last year
fmu2 / flow-VAE
Variational Autoencoder (VAE) with Normalizing Flows
☆71 · Oct 10, 2024 · Updated last year
OpenMOSS / MOSS-Audio-Tokenizer
MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, i…
☆248 · Jun 16, 2026 · Updated last month
SparkAudio / SparkVox
☆37 · Jun 9, 2025 · Updated last year
Soul-AILab / SAC
[ACL 2026 Main] Training, inference, and testing of the SAC speech codec model.
☆108 · Nov 1, 2025 · Updated 8 months ago
redmist328 / APNet2
Source code of APNet2, a vocoder
☆60 · Nov 23, 2023 · Updated 2 years ago
kyutai-labs / nanoGPTaudio
Code for the blog "Neural audio codecs: how to get audio into LLMs"
☆174 · Oct 20, 2025 · Updated 9 months ago
RayYuki / CodecBench
☆24 · Nov 16, 2025 · Updated 8 months ago
zcli-charlie / ZIQI-Eval
ZIQI-Eval: A Music Evaluation Benchmark for Large Language Models
☆18 · Jul 23, 2024 · Updated 2 years ago
hertz-pj / SNAC-Vocos
A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos.
☆70 · Oct 28, 2024 · Updated last year
xiaomi-research / dasheng-audiogen
End-to-end text-to-audio scene generation model
☆50 · Jun 16, 2026 · Updated last month
yangdongchao / ALMTokenizer
The demo page for ALMTokenizer
☆59 · Apr 14, 2025 · Updated last year
XiaomiMiMo / MiMo-Audio-Tokenizer
A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.
☆145 · Sep 19, 2025 · Updated 10 months ago
ZhikangNiu / Semantic-VAE
[INTERSPEECH 2026 Oral] Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆121 · Jun 21, 2026 · Updated last month
HeCheng0625 / Diffusion-Speech-Tokenizer
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198 · Jan 25, 2026 · Updated 6 months ago
cpuimage / Tacotron-2
TensorFlow implementation of DeepMind's Tacotron-2 (without WaveNet)
☆11 · Jul 12, 2019 · Updated 7 years ago
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45 · Feb 9, 2025 · Updated last year
boson-ai / EmergentTTS-Eval-public
[NeurIPS '25] Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.
☆226 · Dec 9, 2025 · Updated 7 months ago
wuzhiyue111 / Codec-Evaluation
☆50 · Apr 5, 2026 · Updated 3 months ago
cmpute / audio-codec-benchmark
Comprehensive quantitative comparison of lossless and lossy audio codecs
☆41 · Feb 11, 2023 · Updated 3 years ago
wuzhiyue111 / MLLM-paper-reading
Multimodal paper reading (Visual, Audio)
☆22 · Nov 24, 2025 · Updated 8 months ago
zeyuxie29 / PicoAudio
☆45 · Jan 13, 2025 · Updated last year
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 · Mar 13, 2024 · Updated 2 years ago
Eps-Acoustic-Revolution-Lab / DUO_TOK
Official repository for “Duo-Tok: Dual-Track Semantic Music Tokenizer for Vocal–Accompaniment Generation.”
☆32 · Nov 26, 2025 · Updated 8 months ago
crypto-code / Music-Representation-Comparison
Code for a comparative analysis of different audio representation models.
☆11 · Aug 31, 2023 · Updated 2 years ago
MoonshotAI / Kimi-Audio-Evalkit
☆169 · Nov 20, 2025 · Updated 8 months ago