🔥 [NeurIPS 2025] Official implementation of "Generate, but Verify: Reducing Visual Hallucination in Vision-Language Models with Retrospective Resampling (REVERSE)"
☆54 · Jan 22, 2026 · Updated last month
Alternatives and similar repositories for reverse_vlm
Users who are interested in reverse_vlm are comparing it to the libraries listed below:
- 🔥 [ICLR 2025] Official PyTorch Model "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark" ☆26 · Feb 9, 2025 · Updated last year
- [arXiv 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought" ☆15 · Apr 3, 2025 · Updated 11 months ago
- Official Repository of VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents ☆99 · Feb 2, 2026 · Updated last month
- The official implementation of Preference Data Reward-Augmentation. ☆18 · May 1, 2025 · Updated 10 months ago
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- [ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing ☆29 · Feb 6, 2026 · Updated last month
- (ICLR 2026) Official repository of "ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing" ☆59 · Jan 26, 2026 · Updated last month
- ☆34 · Oct 9, 2025 · Updated 4 months ago
- ☆14 · Sep 11, 2025 · Updated 5 months ago
- Code release for AccDiffusionV2 (TPAMI) ☆35 · Nov 4, 2025 · Updated 4 months ago
- Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le… ☆13 · Jan 16, 2025 · Updated last year
- ☆13 · Apr 23, 2025 · Updated 10 months ago
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion ☆14 · Mar 17, 2025 · Updated 11 months ago
- Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection ☆22 · Feb 5, 2026 · Updated last month
- [NeurIPS 2024] "Self-Calibrated Tuning of Vision-Language Models for Out-of-Distribution Detection" ☆13 · Oct 28, 2024 · Updated last year
- [ICCV 2025] ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models ☆49 · Jul 7, 2025 · Updated 8 months ago
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" ☆59 · Jan 23, 2026 · Updated last month
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models ☆41 · Sep 30, 2024 · Updated last year
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding" ☆23 · Dec 8, 2024 · Updated last year
- This repository contains the code for the paper “Neuro-Symbolic Query Compiler”, accepted to the Findings of ACL 2025. ☆16 · Oct 20, 2025 · Updated 4 months ago
- [ACL 2025] Can MLLMs Understand the Deep Implication Behind Chinese Images? ☆20 · Oct 20, 2025 · Updated 4 months ago
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models ☆39 · Jun 14, 2025 · Updated 8 months ago
- Recursive Visual Programming (ECCV 2024) ☆18 · Nov 20, 2024 · Updated last year
- [ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts ☆17 · Mar 11, 2025 · Updated 11 months ago
- VHTest ☆15 · Oct 31, 2024 · Updated last year
- ☆17 · Apr 9, 2025 · Updated 10 months ago
- Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat… ☆21 · Jun 24, 2025 · Updated 8 months ago
- (ICCV 2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos" ☆46 · Jul 1, 2025 · Updated 8 months ago
- [NeurIPS 2025] Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models ☆64 · Nov 27, 2025 · Updated 3 months ago
- ☆43 · Jul 9, 2025 · Updated 7 months ago
- Cost-Sensitive Toolpath Agent for Multi-turn Image Editing ☆26 · Mar 26, 2025 · Updated 11 months ago
- [ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation ☆28 · Feb 25, 2025 · Updated last year
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation ☆19 · Feb 14, 2025 · Updated last year
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆20 · Apr 9, 2025 · Updated 10 months ago
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u… ☆25 · Jun 4, 2025 · Updated 9 months ago
- Official Implementation of CODE ☆17 · Sep 26, 2024 · Updated last year
- A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAEs trained by Joseph Bloom o… ☆19 · Oct 4, 2024 · Updated last year
- ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR 2025] ☆21 · Aug 21, 2025 · Updated 6 months ago
- [ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation ☆57 · Sep 12, 2025 · Updated 5 months ago