haonan3 / V1
V1: Toward Multimodal Reasoning by Designing Auxiliary Task
☆20 · Updated last week
Alternatives and similar repositories for V1:
Users interested in V1 are comparing it to the repositories listed below.
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?" ☆35 · Updated 2 months ago
- Official repository for the paper "Safety Alignment Should Be Made More Than Just a Few Tokens Deep" ☆82 · Updated 8 months ago
- The official repository for the paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance" ☆34 · Updated 11 months ago
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? ☆27 · Updated 4 months ago
- [ICLR 2025] Code & Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆13 · Updated 9 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆54 · Updated 5 months ago
- ☆20 · Updated last week
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs" ☆78 · Updated last year
- ☆33 · Updated 5 months ago
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024) ☆59 · Updated 2 months ago
- "In-Context Unlearning: Language Models as Few Shot Unlearners". Martin Pawelczyk, Seth Neel* and Himabindu Lakkaraju*; ICML 2024. ☆24 · Updated last year
- Code for the paper "Aligning Large Language Models with Representation Editing: A Control Perspective" ☆25 · Updated last month
- An implementation for MLLM oversensitivity evaluation ☆10 · Updated 4 months ago
- Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective" ☆19 · Updated last year
- [NeurIPS 2024] RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models ☆69 · Updated 5 months ago
- Official repo for the EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning" ☆22 · Updated 5 months ago
- ☆25 · Updated 10 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆55 · Updated last month
- Codebase for the paper "Decoding Compressed Trust" ☆23 · Updated 10 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆74 · Updated 5 months ago
- Official code and data for the ACL 2024 Findings paper "An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models" ☆16 · Updated 4 months ago
- The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation" ☆13 · Updated last week
- ☆19 · Updated 3 weeks ago
- Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning" ☆28 · Updated 3 weeks ago
- ☆16 · Updated last week
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆73 · Updated 3 weeks ago
- Awesome Large Reasoning Model (LRM) Safety. This repository collects security-related research on large reasoning models such as … ☆53 · Updated this week
- ☆50 · Updated 8 months ago
- ECSO (make MLLMs safe with neither training nor any external models!) (https://arxiv.org/abs/2403.09572) ☆23 · Updated 4 months ago
- (ICLR 2025 Spotlight) DEEM: Official implementation of "Diffusion models serve as the eyes of large language models for image perception" ☆26 · Updated 2 weeks ago