phuselab / tppgaze
☆11Updated 3 months ago
Alternatives and similar repositories for tppgaze
Users who are interested in tppgaze are comparing it to the libraries listed below.
- This repository contains a curated list of research papers and resources focusing on saliency and scanpath prediction, human attention, h…☆53Updated 3 weeks ago
- Official codebase for "Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention" (CVPR 2023)☆36Updated last year
- [CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering☆33Updated 2 months ago
- Official code for the paper "Predicting Human Scanpaths in Visual Question Answering"☆21Updated 4 years ago
- [CVPR 2023] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation☆61Updated 3 months ago
- [BMVC 2024 Oral ✨] Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization☆18Updated 8 months ago
- [ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval☆38Updated last month
- CVPR 2024 "Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers"☆19Updated 8 months ago
- [BMVC 2023] Zero-shot Composed Text-Image Retrieval☆54Updated 6 months ago
- The official implementation for Candidate Set Re-ranking for Composed Image Retrieval (TMLR) 01/2024☆19Updated last year
- With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ICCV 2023☆18Updated last year
- Improving neural network representations using human similarity judgments☆13Updated 6 months ago
- Official implementation of "Test-Time Zero-Shot Temporal Action Localization", CVPR 2024☆60Updated 8 months ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval☆83Updated last year
- Composed Video Retrieval☆57Updated last year
- Official repo of the paper "Object-aware Gaze Target Detection" (ICCV 2023)☆41Updated 6 months ago
- Official repository for the paper "TempSAL - Uncovering Temporal Information for Deep Saliency Prediction" (CVPR 2023)☆14Updated 2 months ago
- [ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts☆12Updated 4 months ago
- [ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion☆45Updated last month
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset☆66Updated 9 months ago
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion☆173Updated last year
- (ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning☆27Updated 8 months ago
- This repo contains the code of "Contrastive Supervised Distillation for Continual Representation Learning", Tommaso Barletti, Niccolò Bio…☆20Updated 2 years ago
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)"☆21Updated 4 months ago
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval☆57Updated 11 months ago
- ☆25Updated 9 months ago
- ☆61Updated last year
- The official implementation of ECCV2024 paper "Facial Affective Behavior Analysis with Instruction Tuning"☆26Updated 4 months ago
- Code implementation of our paper: On Large Multimodal Models as Open-World Image Classifiers☆18Updated 2 months ago
- [CVPR 2025] Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval☆15Updated 2 months ago