junwenxiong/diff_sal

Official implementation of the paper DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction

☆29

Alternatives and similar repositories for diff_sal

Users who are interested in diff_sal are comparing it to the libraries listed below.

MengkeSong / SCDL
☆12 · Jan 26, 2023 · Updated 3 years ago
IVRL / AugSal
This is the GitHub repository for Data Augmentation for Saliency Prediction via Latent Diffusion paper in ECCV 2024, Milano, Italy
☆15 · Nov 7, 2024 · Updated last year
MinglangQiao / awesome-salient-object-ranking
A curated list of awesome resources for salient object ranking (SOR)
☆17 · Sep 28, 2025 · Updated 9 months ago
wusonghe / TMFI-Net
☆28 · Jun 6, 2023 · Updated 3 years ago
come880412 / STSANet
The PyTorch implementation of STSANet (non-official)
☆11 · Feb 14, 2023 · Updated 3 years ago
IgnatPolezhaev / MDS-ViTNet
We present a novel methodology we call MDS-ViTNet (Multi Decoder Saliency by Vision Transformer Network) for enhancing visual saliency pr…
☆17 · Jan 18, 2025 · Updated last year
chenxy99 / GazeXplain
[ECCV 2024 Oral] GazeXplain - Official PyTorch Implementation
☆17 · Feb 24, 2025 · Updated last year
guotaowang / STANet
☆16 · Sep 20, 2022 · Updated 3 years ago
samyak0210 / ViNet
ViNet: Pushing the limits of Visual Modality for Audio Visual Saliency Prediction
☆75 · Jul 29, 2025 · Updated 11 months ago
msu-video-group / ECCVW24_Saliency_Prediction
ECCV-AIM 2024 Challenge on Video Saliency Prediction
☆32 · Sep 24, 2024 · Updated last year
cpf0079 / UCDA
Source codes for "Unsupervised Curriculum Domain Adaptation for No-Reference Video Quality Assessment"
☆20 · Dec 19, 2021 · Updated 4 years ago
guobaoxiao / DSAM
Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection, ACM Multimedia (MM), 2024
☆25 · Oct 15, 2024 · Updated last year
perceivelab / hd2s
The official PyTorch implementation for paper "Hierarchical Domain-Adapted Feature Learning for Video Saliency Prediction"
☆27 · Mar 13, 2023 · Updated 3 years ago
msu-video-group / NTIRE26_Saliency_Prediction
CVPR-NTIRE 2026 Challenge on Video Saliency Prediction
☆17 · Mar 20, 2026 · Updated 4 months ago
feiyanhu / tinyHD
☆20 · Mar 6, 2023 · Updated 3 years ago
lxa9867 / QSD
[CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"
☆12 · Feb 27, 2024 · Updated 2 years ago
cozcinar / 360_Audio_Visual_ICMEW2020
Audio-Visual Perception of Omnidirectional Video for Virtual Reality Applications
☆15 · Feb 22, 2023 · Updated 3 years ago
EricDengbowen / QAGNet
Official repository for CVPR 2024 paper "Advancing Saliency Ranking with Human Fixations: Dataset, Models and Benchmarks".
☆21 · Jun 21, 2024 · Updated 2 years ago
WikiChao / DAVIS
[🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official pytorch implementation of the paper "High-Quality Visually-Guided Sound …
☆33 · Mar 30, 2026 · Updated 3 months ago
yingchengy / AVMOE
[NeurIPS 2024] Mixture of Experts for Audio-Visual Learning
☆25 · Jan 19, 2025 · Updated last year
Linardos / SalEMA
Simple vs complex temporal recurrences for video saliency prediction (BMVC 2019)
☆27 · Nov 22, 2022 · Updated 3 years ago
yongliang-wu / Repurpose
[AAAI2025] Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark
☆31 · Apr 4, 2026 · Updated 3 months ago
ruohaoguo / avis
[CVPR 2025] 🔥 Official impl. of "Audio-Visual Instance Segmentation".
☆49 · Jun 5, 2025 · Updated last year
zjuruizhechen / TVG-R1
[EMNLP 2025 Industry] Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning
☆36 · Oct 22, 2025 · Updated 8 months ago
ibribr / ML
IE500618 Machine Learning Course
☆14 · Nov 7, 2020 · Updated 5 years ago
jinxiang-liu / anno-free-AVS
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
☆38 · Oct 11, 2024 · Updated last year
FannyChao / AVS360_audiovisual_saliency_360
Towards Audio-Visual Saliency Prediction for Omnidirectional Video with Spatial Audio
☆20 · Dec 28, 2021 · Updated 4 years ago
MertCokelek / SalViT360
Official PyTorch implementation of our paper "Spherical Vision Transformer for 360° Video Saliency Prediction" (BMVC 2023)
☆24 · Mar 27, 2024 · Updated 2 years ago
clh124 / VQAThinker
[AAAI 2026] Official Code for VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning
☆30 · Nov 28, 2025 · Updated 7 months ago
xinyuguo1566 / PriorVLA
Official implementation of PriorVLA.
☆17 · May 11, 2026 · Updated 2 months ago
PanchengZhao / CD3AL
A large-scale dataset for classification and detection of apple leaf diseases
☆14 · Apr 1, 2023 · Updated 3 years ago
jianzongwu / betrayed-by-captions
(ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
☆48 · Jul 18, 2024 · Updated 2 years ago
nku-zhichengzhang / MPOT
[ICCV 2023] This is the official implementation of "Multiple Planar Object Tracking"
☆24 · Aug 19, 2023 · Updated 2 years ago
zhangkao / IIP_STRNN_Saliency
A Spatial-Temporal Recurrent Neural Network for Video Saliency Prediction (TIP2021)
☆13 · Jul 7, 2022 · Updated 4 years ago
GeWu-Lab / Crab
[CVPR 2025] Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
☆85 · Dec 24, 2025 · Updated 6 months ago
EdenGabriel / TaskWeave
[CVPR 2024 Accepted] TaskWeave: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection
☆30 · Sep 26, 2024 · Updated last year
SHI-Labs / Slow-Fast-Video-Multimodal-LLM
☆29 · Apr 8, 2025 · Updated last year
kuai-lab / soundini-official
We are committing code.
☆44 · May 18, 2023 · Updated 3 years ago
MemoonaTahira / CrowdFix
IEEE published Eye-tracking dataset of Human Eye Fixations over Crowd Videos. This dataset is part of the MIT/Tübingen Saliency Benchmark…
☆11 · Jun 13, 2023 · Updated 3 years ago