fscdc / ReasonMap
[arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
☆72 Updated 3 weeks ago
Alternatives and similar repositories for ReasonMap
Users who are interested in ReasonMap are comparing it to the libraries listed below.
- ☆34 Updated 8 months ago
- [NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing ☆90 Updated 6 months ago
- [ICCV 2025] Official code for paper: Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs ☆68 Updated 7 months ago
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning ☆103 Updated 6 months ago
- Visual Spatial Tuning ☆171 Updated this week
- STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding? ☆35 Updated 3 weeks ago
- [NeurIPS 2025] EOC-Bench, an innovative benchmark designed to systematically evaluate object-centric embodied cognition in dynamic egocen… ☆22 Updated 7 months ago
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO ☆78 Updated 2 months ago
- ☆41 Updated 7 months ago
- ☆22 Updated 8 months ago
- This repository contains the code for our ICML 2025 paper, LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection 🎉 ☆25 Updated 8 months ago
- [NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning ☆38 Updated 3 months ago
- [AAAI 2024] Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-Supervised 3D Object Detection ☆10 Updated last year
- ☆117 Updated 2 weeks ago
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983… ☆64 Updated last week
- Official code for the ECCV 2024 paper "Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection" ☆14 Updated last year
- Official Implementation of "Geometrically-Constrained Agent for Spatial Reasoning" ☆48 Updated last month
- The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (arXiv 2025) ☆39 Updated 8 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆117 Updated 10 months ago
- ☆68 Updated 3 months ago
- [NeurIPS 2025] Official repository for “FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models” ☆28 Updated last month
- Official implementation of NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments (ICCV'25). ☆66 Updated last month
- Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence ☆426 Updated 2 weeks ago
- Code of 3DMIT: 3D MULTI-MODAL INSTRUCTION TUNING FOR SCENE UNDERSTANDING ☆31 Updated last year
- [ICLR 2026] OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models ☆79 Updated 2 weeks ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT ☆117 Updated last week
- ☆35 Updated 3 months ago
- [CVPR'2025] EntitySAM: Segment Everything in Video ☆60 Updated 6 months ago
- [IJCV 2024] ☆19 Updated last year
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio… ☆75 Updated last month