☆48 · Jul 10, 2024 · Updated last year
Alternatives and similar repositories for video-physics-sound-diffusion
Users who are interested in video-physics-sound-diffusion are comparing it to the repositories listed below
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral) ☆33 · Feb 11, 2026 · Updated 3 weeks ago
- ☆21 · Mar 4, 2024 · Updated 2 years ago
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies". ☆93 · Dec 8, 2023 · Updated 2 years ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers. ☆12 · Nov 7, 2024 · Updated last year
- Solos: A Dataset for Audio-Visual Music Analysis ☆24 · Feb 17, 2023 · Updated 3 years ago
- [CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos' ☆13 · Jun 16, 2024 · Updated last year
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition" ☆12 · Feb 27, 2024 · Updated 2 years ago
- ☆33 · Apr 10, 2023 · Updated 2 years ago
- End-to-end realization of HumanNeRF ☆15 · Sep 5, 2023 · Updated 2 years ago
- ☆13 · Jul 14, 2024 · Updated last year
- This repository holds datasets of polyphonic drum patterns used in the creation of Electronic Dance Music. ☆14 · Dec 19, 2016 · Updated 9 years ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation ☆41 · Dec 23, 2023 · Updated 2 years ago
- ☆19 · Jan 30, 2023 · Updated 3 years ago
- ☆16 · Sep 7, 2022 · Updated 3 years ago
- A large-scale real-world audio-visual dataset for research on 3D scene understanding and echolocation. ☆19 · Oct 21, 2025 · Updated 4 months ago
- MeshRIR: Dataset of room impulse responses on meshed grid points ☆43 · Mar 23, 2024 · Updated last year
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs. ☆19 · Jun 27, 2024 · Updated last year
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024 ☆18 · Oct 11, 2024 · Updated last year
- Source Separation on Musical Instrument Sounds ☆38 · Jan 4, 2022 · Updated 4 years ago
- [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation ☆45 · Sep 6, 2024 · Updated last year
- Official implementation of the pipeline presented in I hear your true colors: Image Guided Audio Generation ☆125 · Jan 18, 2023 · Updated 3 years ago
- ☆18 · Nov 22, 2024 · Updated last year
- Spatial Audio Generation ☆117 · Mar 24, 2023 · Updated 2 years ago
- PyTorch implementation of ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos" ☆40 · Dec 15, 2020 · Updated 5 years ago
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" ☆52 · Dec 5, 2024 · Updated last year
- Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser" ☆17 · Jul 13, 2025 · Updated 7 months ago
- 🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets). ☆19 · Apr 1, 2021 · Updated 4 years ago
- ☆20 · Dec 29, 2024 · Updated last year
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning ☆20 · Dec 21, 2023 · Updated 2 years ago
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder". ☆287 · Mar 20, 2024 · Updated last year
- [CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation ☆452 · Jun 5, 2024 · Updated last year
- Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models ☆200 · May 29, 2024 · Updated last year
- Applying reinforcement learning to perform source separation. ☆23 · Nov 25, 2020 · Updated 5 years ago
- Language-based navigation project ☆22 · Feb 9, 2024 · Updated 2 years ago
- ☆22 · Mar 20, 2024 · Updated last year
- Geometric-Wave Acoustic dataset ☆64 · Aug 14, 2022 · Updated 3 years ago
- Discrete wavelet transform layers with fixed and trainable wavelets ☆22 · Nov 27, 2022 · Updated 3 years ago
- Neural IIR Filter Field for HRTF Upsampling and Personalization ☆27 · Feb 26, 2024 · Updated 2 years ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo… ☆20 · Sep 1, 2023 · Updated 2 years ago