LzVv123456 / VISTA
☆72 Jul 28, 2025 Updated 6 months ago
Alternatives and similar repositories for VISTA
Users who are interested in VISTA are comparing it to the libraries listed below.
- [EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information ☆12 Oct 11, 2024 Updated last year
- ☆22 Mar 12, 2025 Updated 11 months ago
- ☆40 May 24, 2024 Updated last year
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆23 Sep 21, 2025 Updated 4 months ago
- [ICCV 2025] ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models ☆49 Jul 7, 2025 Updated 7 months ago
- [ICLR 2025] Code for Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models ☆24 Apr 14, 2025 Updated 10 months ago
- The official repo for "Where do Large Vision-Language Models Look at when Answering Questions?" ☆56 Jan 7, 2026 Updated last month
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation ☆133 Sep 11, 2025 Updated 5 months ago
- [ICML 2023] Taxonomy-Structured Domain Adaptation ☆12 Oct 6, 2023 Updated 2 years ago
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation ☆19 Feb 14, 2025 Updated last year
- ☆18 Jun 10, 2025 Updated 8 months ago
- ☆21 Jun 5, 2025 Updated 8 months ago
- Code for paper: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models ☆52 Dec 18, 2024 Updated last year
- Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering ☆103 Nov 23, 2024 Updated last year
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models ☆75 May 31, 2025 Updated 8 months ago
- ☆62 Jun 16, 2023 Updated 2 years ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat… ☆47 Aug 21, 2024 Updated last year
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration ☆26 Oct 17, 2024 Updated last year
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ☆10 Dec 15, 2024 Updated last year
- ☆12 Oct 7, 2024 Updated last year
- Official Repository of LatentSeek ☆76 Jun 6, 2025 Updated 8 months ago
- [ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs ☆163 Nov 6, 2024 Updated last year
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ☆104 Nov 9, 2024 Updated last year
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models ☆54 May 3, 2025 Updated 9 months ago
- [NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models ☆34 Nov 13, 2024 Updated last year
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆61 Jul 16, 2024 Updated last year
- [NeurIPS 2023] A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm ☆51 Jan 26, 2025 Updated last year
- ☆13 Feb 24, 2025 Updated 11 months ago
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆95 Nov 30, 2025 Updated 2 months ago
- Bayesian Low-Rank Adaptation of LLMs: BLoB [NeurIPS 2024] and TFB [NeurIPS 2025] ☆33 Feb 4, 2026 Updated last week
- Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs ☆32 Sep 21, 2025 Updated 4 months ago
- The author's implementation of FUDOKI, a multimodal large language model purely based on discrete flow matching. ☆68 Dec 21, 2025 Updated last month
- [CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding ☆378 Oct 7, 2024 Updated last year
- A library of visualization tools for the interpretability and hallucination analysis of large vision-language models (LVLMs). ☆41 May 22, 2025 Updated 8 months ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding ☆66 Jun 10, 2025 Updated 8 months ago
- [ACL 2025] Can MLLMs Understand the Deep Implication Behind Chinese Images? ☆20 Oct 20, 2025 Updated 3 months ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding" ☆22 Dec 8, 2024 Updated last year
- Implementation for the paper 'Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport' (ICL… ☆18 Jan 1, 2025 Updated last year
- ☆21 Jul 25, 2025 Updated 6 months ago