aimagelab / awesome-human-visual-attention
This repository contains a curated list of research papers and resources on saliency and scanpath prediction, human attention, and human visual search.
☆63 · Updated 8 months ago
Alternatives and similar repositories for awesome-human-visual-attention
Users who are interested in awesome-human-visual-attention are comparing it to the repositories listed below.
- Official codebase for "Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention" (CVPR 2023) ☆43 · Updated last year
- ☆16 · Updated 11 months ago
- Official repository for the paper "TempSAL - Uncovering Temporal Information for Deep Saliency Prediction" (CVPR 2023) ☆15 · Updated 10 months ago
- [NeurIPS2022] Mind Reader: Reconstructing complex images from brain activities ☆62 · Updated 3 years ago
- [CVPR 2023 & IJCV 2025] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation ☆64 · Updated 6 months ago
- [CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners". ☆302 · Updated last year
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023] ☆101 · Updated last year
- CVPR 2024 "Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers" ☆22 · Updated 7 months ago
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV … ☆23 · Updated last month
- [WACV2025 Oral] SUM: Saliency Unification through Mamba for Visual Attention Modeling ☆88 · Updated 5 months ago
- Official code for the paper "Predicting Human Scanpaths in Visual Question Answering" ☆23 · Updated 4 years ago
- ☆85 · Updated 2 years ago
- Official code repo for TCLR: Temporal Contrastive Learning for Video Representation [CVIU-2022] ☆41 · Updated last year
- Code for CVPR 2023 paper "SViTT: Temporal Learning of Sparse Video-Text Transformers" ☆20 · Updated 2 years ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning ☆137 · Updated 2 years ago
- ☆62 · Updated 2 years ago
- Learning Bottleneck Concepts in Image Classification (CVPR 2023) ☆43 · Updated 2 years ago
- Code for "Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations" (CVPR 2024 Oral) ☆18 · Updated last year
- ☆58 · Updated last month
- Composed Video Retrieval ☆61 · Updated last year
- Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies" ☆18 · Updated 2 years ago
- Official repository for "Self-Supervised Video Transformer" (CVPR'22) ☆108 · Updated last year
- ☆54 · Updated last year
- Official repo of the paper "Object-aware Gaze Target Detection" (ICCV 2023) ☆45 · Updated last year
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion ☆196 · Updated 6 months ago
- Scanpath metrics in Python ☆31 · Updated 4 years ago
- [ICLR 2024] FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition ☆95 · Updated last year
- Plotting heatmaps with the self-attention of the [CLS] tokens in the last layer. ☆50 · Updated 3 years ago
- [CVPR'23] AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders ☆84 · Updated last year
- [WACV 2024] Code release for "VEATIC: Video-based Emotion and Affect Tracking in Context Dataset" ☆21 · Updated 2 weeks ago