BurakCanBiner/SonicDiffusion

BurakCanBiner / SonicDiffusion

☆43

Alternatives and similar repositories for SonicDiffusion

Users who are interested in SonicDiffusion are comparing it to the libraries listed below.

kaist-ami / Sound2Scene
☆42 · Apr 14, 2025 · Updated last year
guyyariv / AudioToken
[InterSpeech 2023] The official PyTorch implementation of: "AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Imag…
☆89 · May 18, 2026 · Updated 2 months ago
kaist-ami / SoundBrush
☆14 · Dec 8, 2025 · Updated 7 months ago
slliugit / slliugit.github.io
music denoising network
☆16 · Sep 24, 2024 · Updated last year
naver-ai / rewas
Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video"
☆44 · Dec 13, 2024 · Updated last year
yzxing87 / Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
☆155 · Jul 6, 2024 · Updated 2 years ago
Tinglok / avstyle
Codebase for the Paper: Learning Visual Styles from Audio-Visual Associations (ECCV 2022, in PyTorch)
☆15 · Jan 26, 2023 · Updated 3 years ago
GeWu-Lab / Stepping-Stones
The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
☆18 · Oct 11, 2024 · Updated last year
WikiChao / VisAH
[CVPR 2025] Pytorch implementation of the paper "Learning to Highlight Audio by Watching Movies"
☆15 · Oct 1, 2025 · Updated 9 months ago
heng-hw / V2A-Mapper
[AAAI 2024] V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models
☆29 · Dec 14, 2023 · Updated 2 years ago
flying-sky999 / OmniV2V
☆15 · Jun 2, 2025 · Updated last year
kuai-lab / sound-guided-semantic-image-manipulation
Sound-guided Semantic Image Manipulation - Official Pytorch Code (CVPR 2022)
☆80 · Aug 14, 2023 · Updated 2 years ago
taegyeong-lee / Generating-Realistic-Images-from-In-the-wild-Sounds
Official Code Repository for the paper "Generating Realistic Images from In-the-wild Sounds", ICCV 2023
☆12 · Aug 24, 2025 · Updated 11 months ago
mcomunita / syncfusion
SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis
☆19 · Jul 22, 2024 · Updated 2 years ago
jnwnlee / video-foley
Official implementation of "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound". IEEE TASLP 20…
☆19 · Feb 27, 2026 · Updated 4 months ago
cfeng16 / GPS2Pix
[CVPR 2025] GPS as a Control Signal for Image Generation
☆25 · Mar 18, 2025 · Updated last year
ankitkv / TD-VAE
TD-VAE in PyTorch
☆10 · May 28, 2019 · Updated 7 years ago
kadircenk / WardMTBCuda
☆12 · Jan 28, 2021 · Updated 5 years ago
wuyushuwys / FMEDiffusion
[NeurIPS 2024] Fast and Memory-Efficient Video Diffusion Using Streamlined Inference
☆18 · Dec 3, 2024 · Updated last year
dieKarotte / Spatial-Omni
☆28 · Jun 17, 2026 · Updated last month
codingrex / TimeRewind
☆13 · Apr 28, 2025 · Updated last year
SitongGong / Veason-R1
Official code of Veason-R1
☆15 · Jul 14, 2026 · Updated last week
UESTC-Med424-JYX / Diff-SFCT
Diff-SFCT: A Diffusion Model with Spatial-Frequency Cross Transformer for Medical Image Segmentation
☆10 · Apr 15, 2024 · Updated 2 years ago
HeliosZhao / ControlNet-Stable-UnCLIP
☆12 · Apr 24, 2024 · Updated 2 years ago
imatge-upc / wav2pix
Speech-conditioned face generation using Generative Adversarial Networks (ICASSP 2019)
☆57 · Feb 12, 2022 · Updated 4 years ago
LiuTingWed / Neural-Architecture-Search-PaperAndCode-Sunmary
This repository collects recent NAS-based methods and provides a summary (Paper and Code) by year and task. We hope this repo can help yo…
☆14 · Oct 6, 2022 · Updated 3 years ago
ku-vai / TPoS
This repository is for The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV 2023)
☆25 · Dec 7, 2023 · Updated 2 years ago
jiahanli2022 / confusion-GAN
☆10 · Aug 9, 2025 · Updated 11 months ago
Exgc / OmniSep
Sound Separation, Omni modal
☆29 · Sep 15, 2025 · Updated 10 months ago
mbzuai-oryx / Video-R2
Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models
☆19 · Jan 21, 2026 · Updated 6 months ago
mikugyf / PMIQD-SIS
"Blind Image Quality Assessment for Pathological Microscopic Image under Screen and Immersion Scenarios"
☆15 · Aug 29, 2023 · Updated 2 years ago
philippe-eecs / small-vision
A big_vision inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.
☆34 · Jun 26, 2024 · Updated 2 years ago
bimsarapathiraja / refedit
[ICCV 2025] Official Implementation of RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring …
☆20 · Jun 27, 2025 · Updated last year
ChengyinLee / FocalUNETR
A Focal Transformer for Boundary-aware Prostate Segmentation using CT Images
☆11 · Nov 10, 2024 · Updated last year
shaoshitong / DiffuseExpand
[IJCAI 2023 workshop] Expanding dataset for 2D medical image segmentation using diffusion models
☆15 · Feb 28, 2023 · Updated 3 years ago
ZeyueT / VidMuse
[CVPR 2025] Repository of VidMuse
☆140 · Jun 7, 2025 · Updated last year
TFNTF / PostEdit
Codes of PostEdit
☆24 · Apr 28, 2025 · Updated last year
UCSB-AI / via-video
☆25 · May 12, 2026 · Updated 2 months ago
my-yy / sl_icmr2022
Code for "Self-Lifting: A Novel Framework For Unsupervised Voice-Face Association Learning", ICMR 2022
☆15 · Oct 25, 2024 · Updated last year