LiamLian0727 / Euclids_Gift
This repo is the official implementation of "Euclid’s Gift: Enhancing Spatial Perception and Reasoning in Vision‑Language Models via Geometric Surrogate Tasks"
☆25 Updated 2 months ago
Alternatives and similar repositories for Euclids_Gift
Users who are interested in Euclids_Gift are comparing it to the repositories listed below
- [NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing ☆87 Updated 5 months ago
- The official repository of our paper "Reinforcing Video Reasoning with Focused Thinking" ☆33 Updated 6 months ago
- We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that S… ☆234 Updated this week
- A collection of awesome think with videos papers. ☆76 Updated last month
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning". ☆78 Updated 6 months ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ☆136 Updated 2 weeks ago
- ☆65 Updated 2 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025) ☆222 Updated 5 months ago
- Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning" ☆102 Updated 2 weeks ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT ☆112 Updated 2 months ago
- [NeurIPS'24] SpatialEval: a benchmark to evaluate spatial reasoning abilities of MLLMs and LLMs ☆58 Updated 11 months ago
- [NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration ☆105 Updated last month
- ☆112 Updated 5 months ago
- ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models ☆66 Updated 7 months ago
- Collections of Papers and Projects for Multimodal Reasoning. ☆106 Updated 8 months ago
- 【COLING 2025🔥】Code for the paper "Is Parameter Collision Hindering Continual Learning in LLMs?". ☆37 Updated last year
- Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025 ☆95 Updated 9 months ago
- Incentivizing "Thinking with Long Videos" via Native Tool Calling ☆166 Updated this week
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning ☆95 Updated 3 months ago
- STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding? ☆34 Updated 6 months ago
- ☆45 Updated 4 months ago
- ☆116 Updated 2 months ago
- ☆57 Updated 4 months ago
- [LLaVA-Video-R1]✨First Adaptation of R1 to LLaVA-Video (2025-03-18) ☆36 Updated 8 months ago
- Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward ☆57 Updated last month
- [arxiv 2025] RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning ☆36 Updated 2 months ago
- [NeurIPS 2025] Official repository for “FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models” ☆27 Updated last month
- [CVPR2025] BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding ☆36 Updated 9 months ago
- ☆28 Updated 11 months ago
- TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models ☆64 Updated last month