☆46Dec 30, 2024Updated last year
Alternatives and similar repositories for VisVM
Users that are interested in VisVM are comparing it to the libraries listed below
Sorting:
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding"☆22Dec 8, 2024Updated last year
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…☆47Aug 21, 2024Updated last year
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆19Mar 10, 2025Updated 11 months ago
- ☆17Jan 9, 2025Updated last year
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning☆91Apr 30, 2024Updated last year
- Multimodal RewardBench☆62Feb 21, 2025Updated last year
- ☆18Jun 10, 2025Updated 8 months ago
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models☆21Jan 29, 2025Updated last year
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202…☆40May 26, 2025Updated 9 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆86Nov 10, 2024Updated last year
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆64Jul 22, 2025Updated 7 months ago
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆43Mar 11, 2025Updated 11 months ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆155Apr 30, 2024Updated last year
- A Self-Training Framework for Vision-Language Reasoning☆88Jan 23, 2025Updated last year
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆86Oct 26, 2025Updated 4 months ago
- ☆14Apr 14, 2025Updated 10 months ago
- E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models☆39Jan 5, 2026Updated last month
- [COLM'25] Official implementation of the Law of Vision Representation in MLLMs☆175Oct 6, 2025Updated 4 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning☆71Jul 13, 2025Updated 7 months ago
- ☆33Nov 18, 2025Updated 3 months ago
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research☆22Sep 23, 2025Updated 5 months ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆15Feb 9, 2026Updated 2 weeks ago
- ☆51May 11, 2025Updated 9 months ago
- Code, Data and Red Teaming for ZeroBench☆54Dec 23, 2025Updated 2 months ago
- [EMNLP 2024 Findings🔥] Official implementation of ": LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…☆104Nov 9, 2024Updated last year
- ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding☆17Aug 8, 2025Updated 6 months ago
- ☆13Jul 10, 2024Updated last year
- (ACL 2025) MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale☆49Jun 4, 2025Updated 8 months ago
- ☆12Dec 4, 2024Updated last year
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning☆26Oct 13, 2022Updated 3 years ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆109May 27, 2025Updated 9 months ago
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption 🔍☆47Jul 5, 2025Updated 7 months ago
- [ICCV 2025] Dynamic-VLM☆28Dec 16, 2024Updated last year
- (ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph.☆31Aug 7, 2025Updated 6 months ago
- Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025)☆21Jul 16, 2025Updated 7 months ago
- Official implementation of "Relational Proxies: Emergent Relationships as Fine-Grained Discriminators", NeurIPS 2022.☆14Feb 1, 2025Updated last year
- CoV: Chain-of-View Prompting for Spatial Reasoning☆51Jan 23, 2026Updated last month
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.☆25Oct 7, 2025Updated 4 months ago
- [EMNLP 2024] SURf: Teaching Large Vision-Language Models to Selectively Utilize Retrieved Information☆12Oct 11, 2024Updated last year