NVIDIA/diffusion-audio-restoration

Audio-to-Audio Schrodinger Bridges is a diffusion-based audio restoration model for bandwidth extension and inpainting.

☆146

Alternatives and similar repositories for diffusion-audio-restoration

Users that are interested in diffusion-audio-restoration are comparing it to the libraries listed below.

JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
YangXusheng-yxs / CodecFormer_5Hz
☆35 · Oct 23, 2025 · Updated 9 months ago
NVIDIA / audio-intelligence
Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with syntheti…
☆137 · Mar 3, 2026 · Updated 4 months ago
jjunak-yun / FLowHigh_code
[ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"
☆118 · Jan 17, 2025 · Updated last year
Andong-Li-speech / BridgeVoC
This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".
☆67 · Nov 5, 2025 · Updated 8 months ago
JusperLee / Apollo
Music repair method to convert lossy MP3 compressed music to lossless music.
☆396 · Aug 12, 2025 · Updated 11 months ago
smulelabs / smule-renaissance
Official Repository of Smule Renaissance, Smule's Vocal Restoration Models
☆43 · Oct 27, 2025 · Updated 8 months ago
Lab-MSP / NaturalVoices
☆33 · Oct 28, 2025 · Updated 8 months ago
bernardo-torres / linear-autoencoders
Official code and pretrained models for Linear Consistency Autoencoders (Lin-CAE), a method to induce linearity in audio autoencoders via…
☆17 · Feb 12, 2026 · Updated 5 months ago
jakeoneijk / FlashSR_Inference
☆78 · Jan 25, 2025 · Updated last year
Eps-Acoustic-Revolution-Lab / EAR_VAE
[INTERSPEECH 2026] This is the official implementation for εar-VAE model including inference and evaluation parts, more details coming so…
☆88 · Feb 13, 2026 · Updated 5 months ago
flamed-tts / Flamed-TTS
This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …
☆57 · Aug 9, 2025 · Updated 11 months ago
LAION-AI / emotion-annotations
☆110 · Jul 15, 2026 · Updated last week
zhai-lw / SQCodec
A lightweight audio codec based on a single quantizer
☆72 · Aug 15, 2025 · Updated 11 months ago
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year
sh-lee97 / grafx
GRAFX: An Open-Source Library for Audio Processing Graphs in PyTorch
☆139 · Jun 29, 2026 · Updated 3 weeks ago
yzGuu830 / efficient-speech-codec
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126 · Mar 20, 2025 · Updated last year
eloimoliner / BABE2-music-restoration
☆61 · Apr 22, 2024 · Updated 2 years ago
ZhikangNiu / Semantic-VAE
[INTERSPEECH 2026 Oral] Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆121 · Jun 21, 2026 · Updated last month
SWivid / AUV
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28 · Oct 11, 2025 · Updated 9 months ago
ZhikangNiu / A-DMA
[INTERSPEECH 2025 Oral] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"
☆67 · Jun 16, 2025 · Updated last year
woongzip1 / UniverSR
Official implementation of UniverSR (ICASSP 2026)
☆59 · Apr 9, 2026 · Updated 3 months ago
facebookresearch / FlowDec
A neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kbps or 4.5 kbps.
☆212 · Jun 22, 2026 · Updated last month
Soul-AILab / SAC
[ACL 2026 Main] Training, inference, and testing of the SAC speech codec model.
☆108 · Nov 1, 2025 · Updated 8 months ago
csteinmetz1 / st-ito
Audio production style transfer with inference-time optimization
☆59 · Jul 17, 2026 · Updated last week
Aria-K-Alethia / BigCodec
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
☆218 · Sep 19, 2024 · Updated last year
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45 · Feb 9, 2025 · Updated last year
smulelabs / windowed-roformer
Official Repository for "Efficient Vocal Source Separation Through Windowed RoFormer"
☆45 · Oct 30, 2025 · Updated 8 months ago
yangdongchao / SimpleSpeech
The open source code for SimpleSpeech series
☆147 · Oct 8, 2024 · Updated last year
P1ping / TokAN-Legacy
☆27 · Jun 22, 2026 · Updated last month
Mddct / usm-tokenizer
semantic tokenizer for speech and music
☆20 · Jul 6, 2025 · Updated last year
matthewmcq / upscalemp3_v2
Mp3 to wav super resolution model for audio restoration & enhancement. U-Net + Discrete Wavelet Transform (DWT) Architecture
☆21 · Dec 1, 2025 · Updated 7 months ago
qiuqiangkong / audioflow
☆130 · Updated this week
WangHelin1997 / SoloAudio
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.
☆121 · Jan 28, 2026 · Updated 5 months ago
primepake / F5-TTS-meanflow-multilingual
Meanflow and multilingual for F5-TTS model
☆16 · Aug 23, 2025 · Updated 11 months ago
Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
yangdongchao / ALMTokenizer2
The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…
☆45 · Sep 5, 2025 · Updated 10 months ago
yxlu-0102 / AP-BWE
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
☆194 · Apr 15, 2025 · Updated last year
luotianze666 / WaveFM
[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
☆133 · Apr 8, 2026 · Updated 3 months ago