WikiChao/DAVIS

[🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official PyTorch implementation of the paper "High-Quality Visually-Guided Sound Separation from Diverse Categories"

☆33

Alternatives and similar repositories for DAVIS

Users who are interested in DAVIS are comparing it to the repositories listed below.

WikiChao / VisAH
[CVPR 2025] PyTorch implementation of the paper "Learning to Highlight Audio by Watching Movies"
☆15 · Oct 1, 2025 · Updated 9 months ago
WikiChao / ZeroSep
[NeurIPS 2025] Separate Anything in Audio with Zero Training
☆60 · Nov 3, 2025 · Updated 8 months ago
liangsusan-git / AV-NeRF
[NeurIPS 2023] AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis
☆36 · Feb 15, 2024 · Updated 2 years ago
yunlong10 / Video-R4
Reinforcing Text-Rich Video Reasoning with Visual Rumination
☆28 · Jun 5, 2026 · Updated last month
sony / mmaudiosep
☆16 · Apr 30, 2026 · Updated 2 months ago
jing-bi / awesome-M.LLM-reasoning
☆20 · May 11, 2025 · Updated last year
jaeyeonkim99 / visage
Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025)
☆47 · Sep 10, 2025 · Updated 10 months ago
roudimit / MUSIC_dataset
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆137 · Aug 12, 2022 · Updated 3 years ago
Exgc / OmniSep
Sound Separation, Omni modal
☆29 · Sep 15, 2025 · Updated 10 months ago
YYX666660 / LAVSS
Code for LAVSS: Location-Guided Audio-Visual Spatial Audio Separation
☆19 · Feb 25, 2025 · Updated last year
WikiChao / ScalingConcept
☆24 · Nov 1, 2024 · Updated last year
yunlong10 / CAT-V
[AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…
☆67 · Jan 27, 2026 · Updated 5 months ago
WikiChao / FreSca
[CVPR 2025 GMCV] Test-Time Frequency Scaling: Instant Frequency Control for Any Diffusion Model
☆55 · May 31, 2025 · Updated last year
sony / CLIPSep
☆43 · Feb 21, 2023 · Updated 3 years ago
OpenNLPLab / TAVGBench
Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation
☆15 · Apr 7, 2025 · Updated last year
jinxiang-liu / UFE-AVS
Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation"
☆19 · Jul 7, 2024 · Updated 2 years ago
Bizilizi / VGGSounder
VGGSounder, a multi-label audio-visual classification dataset with modality annotations.
☆17 · Jun 30, 2026 · Updated 3 weeks ago
FannyChao / AVS360_audiovisual_saliency_360
Towards Audio-Visual Saliency Prediction for Omnidirectional Video with Spatial Audio
☆20 · Dec 28, 2021 · Updated 4 years ago
yingchengy / AVMOE
[NeurIPS 2024] Mixture of Experts for Audio-Visual Learning
☆25 · Jan 19, 2025 · Updated last year
weiguoPian / AV-CIL_ICCV2023
[ICCV 2023] Audio-Visual Class-Incremental Learning
☆35 · Sep 29, 2024 · Updated last year
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
xiaomi-research / acavcaps
☆31 · Mar 27, 2026 · Updated 3 months ago
hmartelb / avlit
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…
☆20 · Sep 1, 2023 · Updated 2 years ago
partha2409 / DCASE2025_seld_baseline
☆27 · May 27, 2025 · Updated last year
pedro-morgado / spatialaudiogen
Spatial Audio Generation
☆117 · Mar 24, 2023 · Updated 3 years ago
junwenxiong / diff_sal
Official implementation of the paper DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
☆29 · May 26, 2024 · Updated 2 years ago
Ayews / M3Net
The implementation of 'M3Net: Multilevel, Mixed and Multistage Attention Network for Salient Object Detection'.
☆12 · Apr 18, 2025 · Updated last year
hanghuacs / MMComposition
☆17 · Jun 20, 2025 · Updated last year
clelouch / Awesome-Camouflaged-Object-Detection
A list of camouflaged object detection papers, codes and datasets.
☆14 · Sep 8, 2023 · Updated 2 years ago
rxtan2 / AVSeT
☆17 · Oct 2, 2023 · Updated 2 years ago
hxixixh / mix-and-localize
☆23 · Mar 20, 2024 · Updated 2 years ago
dieKarotte / ASAudio
☆59 · Oct 19, 2025 · Updated 9 months ago
spkgyk / TDFNet
Official code release for "TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion", accepted ICIST 2023
☆14 · Mar 17, 2024 · Updated 2 years ago
jianzongwu / betrayed-by-captions
(ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
☆48 · Jul 18, 2024 · Updated 2 years ago
buptexplorers / OFB-VR
☆12 · Mar 17, 2020 · Updated 6 years ago
GeWu-Lab / MMCosine_ICASSP23
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26 · May 18, 2023 · Updated 3 years ago
SAKi-77 / DiffStereo
DiffStereo: End-to-End Mono-to-Stereo Audio Generation with Diffusion Transformer
☆15 · Apr 17, 2026 · Updated 3 months ago
maswang32 / latentfouriertransform
☆32 · Apr 21, 2026 · Updated 3 months ago
HuMathe / sonoworld
Official implementation of the CVPR 2026 paper "SonoWorld: From One Image to a 3D Audio-Visual Scene."
☆39 · Jul 6, 2026 · Updated 2 weeks ago