WikiChao / DAVIS
[IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official PyTorch implementation of the paper "High-Quality Visually-Guided Sound Separation from Diverse Categories"
☆25 · Updated 2 months ago
Alternatives and similar repositories for DAVIS
Users who are interested in DAVIS compare it to the repositories listed below.
- [NeurIPS 2023] AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis ☆32 · Updated last year
- [ECCV 2024 Oral] Audio-Synchronized Visual Animation ☆58 · Updated last year
- Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos ☆25 · Updated last year
- ☆34 · Updated last month
- ☆40 · Updated 8 months ago
- Towards training VQ-VAE models robustly! ☆91 · Updated 5 months ago
- [arXiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions ☆33 · Updated 10 months ago
- [CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners ☆154 · Updated last year
- [CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation ☆70 · Updated 2 years ago
- ☆140 · Updated last year
- ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer ☆39 · Updated last year
- ☆41 · Updated last year
- [ICCV 2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To… ☆150 · Updated 5 months ago
- ☆10 · Updated 3 weeks ago
- Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation ☆14 · Updated 8 months ago
- Official code for the paper: [ICCV 2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation ☆41 · Updated 2 years ago
- ☆58 · Updated last year
- Official code for the paper "Understanding Co-speech Gestures in-the-wild" ☆20 · Updated 2 months ago
- ☆187 · Updated last year
- [ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H… ☆20 · Updated 5 months ago
- Download scripts and tools for Replay dataset. ☆36 · Updated 2 years ago
- ☆32 · Updated 5 months ago
- Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation ☆61 · Updated 6 months ago
- [CVPR'23 Highlight] AutoAD: Movie Description in Context. ☆102 · Updated last year
- This repository is for The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV 2023) ☆25 · Updated 2 years ago
- Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers" ☆163 · Updated 11 months ago
- [ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M… ☆27 · Updated 11 months ago
- FQGAN: Factorized Visual Tokenization and Generation ☆57 · Updated 9 months ago
- [ICCV 2023] Online Clustered Codebook ☆181 · Updated last year
- ☆17 · Updated 2 years ago