seilk / VisAttnSinkLinks
[ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models
☆30Updated 3 months ago
Alternatives and similar repositories for VisAttnSink
Users that are interested in VisAttnSink are comparing it to the libraries listed below
Sorting:
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention☆34Updated 10 months ago
- cliptrase☆36Updated 9 months ago
- [ECCV 2024] Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models☆47Updated 10 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆34Updated last year
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding☆43Updated 4 months ago
- 🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".☆35Updated 2 months ago
- [CVPR 2025] Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Att…☆17Updated 3 months ago
- [CVPR 2025 🔥]A Large Multimodal Model for Pixel-Level Visual Grounding in Videos☆66Updated last month
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding☆64Updated last month
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models☆88Updated 7 months ago
- ☆84Updated 2 months ago
- Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"☆38Updated 3 months ago
- Official pytorch implementation of "RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language…☆11Updated 5 months ago
- ☆46Updated last month
- Official implementation of ResCLIP: Residual Attention for Training-free Dense Vision-language Inference☆37Updated 2 months ago
- [NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention☆56Updated 5 months ago
- [CVPRW-25 MMFM] Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite fo…☆48Updated 9 months ago
- [NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization☆105Updated last year
- Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision☆41Updated 2 months ago
- [ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion☆41Updated last month
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models☆91Updated last week
- Official implementation of MC-LLaVA.☆27Updated 4 months ago
- [ICML2024] Repo for the paper `Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models'☆21Updated 5 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆74Updated 11 months ago
- The official pytorch implemention of our CVPR-2024 paper "MMA: Multi-Modal Adapter for Vision-Language Models".☆69Updated last month
- official repo for paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"☆21Updated last month
- [CVPR 2025] VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification☆29Updated 2 months ago
- [CVPR 2025] Few-shot Recognition via Stage-Wise Retrieval-Augmented Finetuning☆18Updated 2 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…☆43Updated 5 months ago
- ☆21Updated last year