seilk/VisAttnSink

[ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models

☆116

Alternatives and similar repositories for VisAttnSink

Users that are interested in VisAttnSink are comparing it to the libraries listed below.

seilk / LocalizationHeads
[CVPR 2025 Highlight] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
☆81 · Aug 31, 2025 · Updated 10 months ago
MICV-yonsei / STORM
[CVPR 2025] Official Pytorch Code for Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synth…
☆15 · Jun 21, 2025 · Updated last year
saccharomycetes / mllms_know
[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'
☆381 · Apr 20, 2025 · Updated last year
MICV-yonsei / LocalizationHeads
[CVPR 2025] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
☆17 · Oct 4, 2025 · Updated 9 months ago
Lackel / AGLA
[CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention
☆68 · Jul 16, 2024 · Updated 2 years ago
GATECH-EIC / ACT
[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…
☆45 · Jun 30, 2024 · Updated 2 years ago
ZhangqiJiang07 / middle_layers_indicating_hallucinations
[CVPR 2025] Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Att…
☆84 · Oct 9, 2025 · Updated 9 months ago
Theia-4869 / CDPruner
[NeurIPS 2025] Official code for paper: Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs.
☆106 · Sep 20, 2025 · Updated 10 months ago
mshukor / ima-lmms
[NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
☆23 · Oct 15, 2024 · Updated last year
Sreyan88 / VDGD
Code for ICLR 2025 Paper: Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
☆25 · May 7, 2025 · Updated last year
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆164 · Jul 8, 2025 · Updated last year
zifuwan / ONLY
[ICCV 2025] ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models
☆51 · Jul 7, 2025 · Updated last year
xmed-lab / TAM
[ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs
☆190 · Dec 14, 2025 · Updated 7 months ago
NishilBalar / Awesome-LVLM-Hallucination
up-to-date curated list of state-of-the-art Large vision language models hallucinations research work, papers & resources
☆325 · Feb 8, 2026 · Updated 5 months ago
rubato-yeong / RRM
[NeurIPS 2025] Interpreting vision transformers via residual replacement model
☆21 · Nov 3, 2025 · Updated 8 months ago
shengtun / Eagle-Anomaly-Detection
This is an official PyTorch implementation for "EAGLE: Expert-Augmented Attention Guidance for Tuning-Free Industrial Anomaly Detection i…
☆20 · Feb 24, 2026 · Updated 5 months ago
jinghan1he / VHR
[ACL 2025] Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence
☆21 · Jun 10, 2025 · Updated last year
clemneo / llava-interp
☆86 · Nov 5, 2024 · Updated last year
nickjiang2378 / test-time-registers
[NeurIPS '25 Spotlight] Official Pytorch implementation of "Vision Transformers Don't Need Trained Registers"
☆184 · Sep 19, 2025 · Updated 10 months ago
nickjiang2378 / vlm-hallucinations
[ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations"
☆105 · Nov 30, 2025 · Updated 7 months ago
ustc-hyin / ClearSight
Code for paper: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large language Models
☆61 · Dec 18, 2024 · Updated last year
LALBJ / PAI
[ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
☆171 · Nov 6, 2024 · Updated last year
Jorffy / NoteMR
[CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering".
☆26 · Jun 16, 2025 · Updated last year
1zhou-Wang / MemVR
[ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in…
☆171 · Sep 25, 2025 · Updated 10 months ago
shiqichen17 / AdaptVis
Github repository for "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas" (ICML 2025)
☆76 · May 2, 2025 · Updated last year
ysj9909 / StAR
[ECCV 2026] StAR: Segment Anything Reasoner
☆25 · Apr 2, 2026 · Updated 3 months ago
zjysteven / VLM-Visualizer
Visualizing the attention of vision-language models
☆304 · Feb 28, 2025 · Updated last year
LaVi-Lab / AIM
[ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"
☆65 · Oct 9, 2025 · Updated 9 months ago
wicai24 / DOOR-Alignment
☆20 · Apr 7, 2025 · Updated last year
ZhangXin1997 / MICCAI-2022
☆10 · Jun 14, 2022 · Updated 4 years ago
zjunlp / Deco
[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
☆147 · Sep 11, 2025 · Updated 10 months ago
zhoujiahuan1991 / ICML2025-TCPA
☆23 · May 8, 2025 · Updated last year
zhangbaijin / From-Redundancy-to-Relevance
[NAACL 2025 Oral] From redundancy to relevance: Enhancing explainability in multimodal large language models
☆130 · Jan 30, 2026 · Updated 5 months ago
Hongcheng-Gao / HAVEN
Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation".
☆25 · Oct 22, 2025 · Updated 9 months ago
Ziwei-Zheng / Nullu
Code for paper: Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
☆63 · Mar 13, 2025 · Updated last year
lchen1019 / Align-TI
[ICML 2026] Beyond Next-Token Alignment: Distilling Multimodal Large Language Models via Token Interactions
☆25 · Feb 11, 2026 · Updated 5 months ago
cvlab-kaist / VIRAL
Official implementation of "VIRAL: Visual Representation Alignment for MLLMs".
☆162 · Sep 21, 2025 · Updated 10 months ago
cpathology / NucleiSegHE
H&E ROI-Level and WSI-Level Nuclei Segmentation with HoVer-Net
☆10 · Jul 30, 2024 · Updated last year
OmriKaduri / vlm-interp
Code for paper: "What’s in the Image? A Deep-Dive into the Vision of Vision Language Models" (CVPR 2025)
☆18 · May 1, 2025 · Updated last year