SeePhys / seephys-project

SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning

☆23

Alternatives and similar repositories for seephys-project

Users who are interested in seephys-project are comparing it to the repositories listed below.

microsoft / visualization-of-thought
[NeurIPS 2024] Repos for the "Visualization-of-Thought" dataset, construction code, and evaluation.
☆30 · Updated 8 months ago
Wang-Xiaodong1899 / CVPR25-MLLM-Paper-List
🔥CVPR 2025 Multimodal Large Language Models Paper List
☆144 · Updated 3 months ago
yanghlll / ScalingNoise
☆37 · Updated 3 months ago
NOVAglow646 / LLM-MLLM-paper-list
A paper list on LLMs and multimodal LLMs
☆41 · Updated 2 weeks ago
SooLab / DDCOT
[NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
☆44 · Updated last year
zhyang2226 / OPA-DPO
[CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key
☆61 · Updated 3 weeks ago
tanhuajie / Reason-RFT
⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.
☆164 · Updated 2 weeks ago
Osilly / Awesome-Interleaving-Reasoning
Interleaving Reasoning: Next-Generation Reasoning Systems for AGI
☆69 · Updated this week
Open-DataFlow / Awesome_MLLMs_Reasoning
☆101 · Updated this week
Ruiyang-061X / Awesome-MLLM-Uncertainty
✨A curated list of papers on the uncertainty in multi-modal large language model (MLLM).
☆48 · Updated 2 months ago
zoedsy / awesome-science-agents
☆43 · Updated 7 months ago
deepcs233 / Visual-CoT
[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought …
☆334 · Updated 6 months ago
forwchen / LLaVA-MoLE
☆9 · Updated last year
jungao1106 / ICoT
[CVPR'25] Interleaved-Modal Chain-of-Thought
☆53 · Updated 2 months ago
zhangquanchen / VisRL
VisRL: Intention-Driven Visual Perception via Reinforced Reasoning
☆29 · Updated 2 weeks ago
yongliang-wu / ExploreCfg
[NeurIPS 2023] Exploring Diverse In-Context Configurations for Image Captioning
☆40 · Updated 7 months ago
IntelLabs / lvlm-interpret
☆83 · Updated 3 months ago
THUNLP-MT / EscapeCraft
Official repo for EscapeCraft (a 3D environment for room escape) and the benchmark MM-Escape
☆16 · Updated 3 weeks ago
Video-R1 / Awesome-Multimodal-Reasoning
Collections of Papers and Projects for Multimodal Reasoning.
☆105 · Updated 2 months ago
inFaaa / Multimodal-Roadmap-for-freshman
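This project provides a learning roadmap for newcomers to the multimodal field, covering the field's classic papers, projects, and courses. It aims to help learners gain a reasonably deep understanding of the field within a set period of time and become able to carry out independent research.
☆19 · Updated last year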
MingyuJ666 / ProLLM
[COLM'24] We propose Protein Chain of Thought (ProCoT), which replicates the biological mechanism of signaling pathways as language promp…
☆63 · Updated 3 months ago
zhaochen0110 / Awesome_Think_With_Images
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆402 · Updated this week
mat-agent / MAT-Agent
☆46 · Updated last week
The-Martyr / Awesome-Multimodal-Reasoning
Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models
☆29 · Updated last week
Hui-design / Open-LLaVA-Video-R1
[LLaVA-Video-R1] ✨ First Adaptation of R1 to LLaVA-Video (2025-03-18)
☆29 · Updated last month
ncTimTang / AKS
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
☆73 · Updated 2 months ago
junyangwang0410 / Attention-LLaVA
A hot-pluggable tool for visualizing LLaVA's attention.
☆19 · Updated last year
mybearyZhang / TwoStageReason
Official implementation of ECCV 2024 paper: Take A Step Back: Rethinking the Two Stages in Visual Reasoning
☆14 · Updated 3 weeks ago
RupertLuo / VoCoT
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
☆66 · Updated 11 months ago
xinyan-cxy / MINT-CoT
☆44 · Updated 2 weeks ago