L-YeZhu/CDCD

L-YeZhu / CDCD

[ICLR2023] Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation (CDCD).

☆163

Alternatives and similar repositories for CDCD

Users who are interested in CDCD are comparing it to the repositories listed below.

L-YeZhu / D2M-GAN
[ECCV2022] D2M-GAN for music generation from dance videos
☆85 · Aug 16, 2022 · Updated 3 years ago
cientgu / VQ-Diffusion
☆487 · Jun 30, 2022 · Updated 4 years ago
SAGNIKMJR / ego-AV-spatial-correspondence
[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'
☆14 · Jun 16, 2024 · Updated 2 years ago
shlizee / RhythmicNet
☆12 · Apr 30, 2025 · Updated last year
vtan05 / dmd
Motion to Dance Music Generation using Latent Diffusion Model
☆23 · Dec 26, 2023 · Updated 2 years ago
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
cyanbx / Frieren-V2A
Implementation of Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching (NeurIPS'24)
☆62 · Apr 3, 2025 · Updated last year
YuanGongND / cav-mae
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
☆292 · Mar 20, 2024 · Updated 2 years ago
NVlabs / denoising-diffusion-gan
Tackling the Generative Learning Trilemma with Denoising Diffusion GANs https://arxiv.org/abs/2112.07804
☆758 · Dec 2, 2022 · Updated 3 years ago
cyj407 / VQ-I2I
Vector Quantized Image-to-Image Translation (ECCV 2022)
☆79 · Nov 28, 2022 · Updated 3 years ago
wzk1015 / video-bgm-generation
[ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer
☆326 · Jun 8, 2025 · Updated last year
konpatp / diffae
Official implementation of Diffusion Autoencoders
☆968 · Sep 12, 2024 · Updated last year
haofanwang / visbeat3
Python3 Implementation for 'Visual Rhythm and Beat' SIGGRAPH 2018
☆20 · May 31, 2022 · Updated 4 years ago
chuangg / Foley-Music
PyTorch implementation of ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos"
☆39 · Dec 15, 2020 · Updated 5 years ago
microsoft / VQ-Diffusion
Official implementation of VQ-Diffusion
☆981 · Apr 17, 2024 · Updated 2 years ago
researchmm / MM-Diffusion
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
☆453 · Jun 5, 2024 · Updated 2 years ago
arpitbansal297 / Cold-Diffusion-Models
Official implementation of Cold-Diffusion for different transformations in pytorch.
☆1,136 · Oct 13, 2022 · Updated 3 years ago
WeilunWang / SinDiffusion
Official Implementation of SinDiffusion: Learning a Diffusion Model from a Single Natural Image
☆306 · Dec 7, 2022 · Updated 3 years ago
batmanlab / MSPC
Maximum Spatial Perturbation for Image-to-Image Translation (Official Implementation)
☆63 · Jul 3, 2022 · Updated 4 years ago
kpandey008 / DiffuseVAE
Official implementation of "DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents"
☆384 · Sep 10, 2022 · Updated 3 years ago
drscotthawley / fad_pytorch
Frechet Audio Distance evaluation in PyTorch
☆36 · Jun 9, 2023 · Updated 3 years ago
OpenGVLab / LORIS
[ICML2023] Long-Term Rhythmic Video Soundtracker
☆63 · Jul 28, 2025 · Updated 11 months ago
plassma / symbolic-music-discrete-diffusion
☆50 · Aug 21, 2023 · Updated 2 years ago
Xiaohao-Liu / Awesome-Vison2Audio
A curated list of Vision (video/image) to Audio Generation
☆107 · Feb 10, 2026 · Updated 5 months ago
baofff / Analytic-DPM
Code for the paper Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models (ICLR 2022 Outsta…
☆173 · May 18, 2022 · Updated 4 years ago
ChenWu98 / generative-visual-prompt
[NeurIPS 2022] (Amortized) distributional control for pre-trained generative models
☆121 · Sep 4, 2023 · Updated 2 years ago
brian7685 / Multimodal-Clustering-Network
ICCV 2021
☆34 · May 11, 2022 · Updated 4 years ago
sangyun884 / blur-diffusion
Official PyTorch implementation of the paper Progressive Deblurring of Diffusion Models for Coarse-to-Fine Image Synthesis.
☆157 · Sep 11, 2022 · Updated 3 years ago
YeLuoSuiYou / openstorypp
We introduce OpenStory++, a large-scale open-domain dataset focusing on enabling MLLMs to perform storytelling generation tasks.
☆18 · Aug 30, 2024 · Updated last year
zxxwxyyy / sonique
Video Background Music Generation Using Unpaired Audio-Visual Data
☆33 · Oct 8, 2024 · Updated last year
PITI-Synthesis / PITI
PITI: Pretraining is All You Need for Image-to-Image Translation
☆502 · Jun 2, 2024 · Updated 2 years ago
ETH-DISCO / blap
Official repo for BLAP: Bootstrapping Language-Audio Pre-training for Music Captioning presented at ICASSP 2025
☆16 · Nov 18, 2024 · Updated last year
DifanLiu / ASSET
ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions (SIGGRAPH 2022 - Journal Track)
☆112 · May 25, 2022 · Updated 4 years ago
chrisfan-wc / Frido
Research code for paper "Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis"
☆114 · Nov 13, 2024 · Updated last year
yunyikristy / CM-ACC
Cross-modal active contrastive coding
☆22 · Mar 17, 2021 · Updated 5 years ago
haoheliu / audioldm_eval
This toolbox aims to unify audio generation model evaluation for easier comparison.
☆390 · Sep 29, 2024 · Updated last year
voletiv / mcvd-pytorch
Official implementation of MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation (https://arxiv.org/abs/…
☆370 · Sep 22, 2022 · Updated 3 years ago
gwang-kim / DiffusionCLIP
[CVPR 2022] Official PyTorch Implementation for DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models
☆867 · Mar 27, 2023 · Updated 3 years ago
luping-liu / PNDM
The official implementation for Pseudo Numerical Methods for Diffusion Models on Manifolds (PNDM, PLMS | ICLR2022)
☆356 · Apr 25, 2023 · Updated 3 years ago