STARE-bench / STARE
☆15Updated 2 months ago
Alternatives and similar repositories for STARE
Users that are interested in STARE are comparing it to the libraries listed below
- ☆112Updated 3 months ago
- Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025)☆61Updated 8 months ago
- [CVPR' 25] Interleaved-Modal Chain-of-Thought☆103Updated this week
- A paper list on LLMs and multimodal LLMs☆50Updated 2 weeks ago
- Code release for VTW (AAAI 2025 Oral)☆65Updated last month
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".☆77Updated 5 months ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"☆130Updated last week
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI☆228Updated 2 months ago
- Official codebase for the paper Latent Visual Reasoning☆69Updated 2 months ago
- ☆36Updated 4 months ago
- 🔥An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks.☆114Updated this week
- [ICML 2025 Oral] The official repository for the paper "Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchma…☆69Updated 5 months ago
- MAT: Multi-modal Agent Tuning 🔥 ICLR 2025 (Spotlight)☆79Updated 2 weeks ago
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg…☆35Updated 3 weeks ago
- ☆66Updated 5 months ago
- Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs"☆73Updated 3 months ago
- Official repository for “Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space”☆15Updated 2 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"☆106Updated last year
- Code for paper: Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection☆48Updated 9 months ago
- [NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains☆66Updated 5 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation☆102Updated 3 months ago
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆93Updated 3 months ago
- A hot-pluggable tool for visualizing LLaVA's attention.☆24Updated last year
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…☆103Updated last year
- [ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont…☆67Updated 3 months ago
- [ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos☆101Updated 3 weeks ago
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models☆73Updated 7 months ago
- A Collection of Papers on Diffusion Language Models☆149Updated 3 months ago
- An up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models☆238Updated 2 months ago
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation☆127Updated 3 months ago