☆49Jul 10, 2024Updated last year
Alternatives and similar repositories for video-physics-sound-diffusion
Users that are interested in video-physics-sound-diffusion are comparing it to the libraries listed below.
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral)☆33Feb 11, 2026Updated 3 months ago
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".☆93Dec 8, 2023Updated 2 years ago
- Solos: A Dataset for Audio-Visual Music Analysis☆24Feb 17, 2023Updated 3 years ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- Official implementation of the pipeline presented in I hear your true colors: Image Guided Audio Generation☆125Jan 18, 2023Updated 3 years ago
- MeshRIR: Dataset of room impulse responses on meshed grid points☆43Mar 13, 2026Updated 2 months ago
- ☆34Apr 10, 2023Updated 3 years ago
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- ☆15Sep 24, 2022Updated 3 years ago
- [CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'☆13Jun 16, 2024Updated last year
- Geometric-Wave Acoustic dataset☆64Aug 14, 2022Updated 3 years ago
- This repository holds datasets of polyphonic drum patterns used in the creation of Electronic Dance Music.☆16Dec 19, 2016Updated 9 years ago
- A large-scale real-world audio-visual dataset for research on 3D scene understanding and echolocation.☆22Oct 21, 2025Updated 7 months ago
- ☆21Mar 4, 2024Updated 2 years ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Sep 1, 2023Updated 2 years ago
- ☆19Jan 30, 2023Updated 3 years ago
- PyTorch implementation of ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos"☆39Dec 15, 2020Updated 5 years ago
- Sound synthesis with physical models - from http://taopm.sourceforge.net☆11Jun 13, 2020Updated 5 years ago
- [AAAI 2024] V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models☆29Dec 14, 2023Updated 2 years ago
- Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models☆203May 29, 2024Updated last year
- ☆16Sep 7, 2022Updated 3 years ago
- Diffusion-based Speech Enhancement: Demonstration of Performance and Generalization☆14Dec 21, 2024Updated last year
- [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation☆46Sep 6, 2024Updated last year
- [ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale …☆18Aug 17, 2025Updated 9 months ago
- Spatial Audio Generation☆117Mar 24, 2023Updated 3 years ago
- Another rubberband-wasm story but with ready-to-use AudioWorklet and WebWorker☆17Nov 14, 2022Updated 3 years ago
- ☆37Mar 26, 2024Updated 2 years ago
- ☆18Jul 9, 2024Updated last year
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".☆291Mar 20, 2024Updated 2 years ago
- [CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation☆454Jun 5, 2024Updated last year
- 🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets).☆19Apr 1, 2021Updated 5 years ago
- Utilities and experiments for training RAVE☆16Oct 23, 2023Updated 2 years ago
- [NeurIPS 2024] Code, Dataset, Samples for the VATT paper “Tell What You Hear From What You See - Video to Audio Generation Through Text”☆37Jul 24, 2025Updated 10 months ago
- Language-based navigation project☆22Feb 9, 2024Updated 2 years ago
- Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser"☆17Jul 13, 2025Updated 10 months ago
- Sample based concatenative synthesizer for the NSynth dataset. Render any MIDI (.mid) sequence with the notes of NSynth.☆12Oct 4, 2023Updated 2 years ago
- This repository is for an implementation of the accepted paper "Sketching the Expression: Flexible Rendering of Expressive Piano Performa…☆22Dec 15, 2022Updated 3 years ago
- Curated list for papers, codes and resources related to Text-to-Audio (TTA) Generation☆72Jan 22, 2026Updated 4 months ago
- PyTorch implementation of "Lip to Speech Synthesis in the Wild with Multi-task Learning" (ICASSP2023)☆71Mar 9, 2024Updated 2 years ago