arijitray1993 / awesome-spatial-reasoning
Collection of the latest spatial, 3D, and video/temporal reasoning papers
☆31 Updated 4 months ago
Alternatives and similar repositories for awesome-spatial-reasoning
Users who are interested in awesome-spatial-reasoning are comparing it to the repositories listed below
- OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models ☆78 Updated last week
- STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding? ☆35 Updated 2 weeks ago
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆167 Updated 3 months ago
- [NeurIPS 2025] Source code for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning" ☆125 Updated 2 months ago
- Official Implementation of "Geometrically-Constrained Agent for Spatial Reasoning" ☆48 Updated last month
- [NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆126 Updated 5 months ago
- A list of works on video generation towards world models ☆330 Updated 2 weeks ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … ☆201 Updated 8 months ago
- ☆118 Updated 2 months ago
- [CVPR 2025🔥] Official codebase for "Global-Local Tree Search in VLMs for 3D Indoor Scene Generation" ☆20 Updated 9 months ago
- [ARXIV’25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control ☆88 Updated 6 months ago
- Code implementation of the paper "World-in-World: World Models in a Closed-Loop World" ☆124 Updated last month
- Official implementation of ICCV 2025 paper "EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds". ☆45 Updated 7 months ago
- SPAgent, a spatial intelligence agent designed to operate in the physical and spatial world. ☆56 Updated last week
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆43 Updated last year
- The official implementation of the work "REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment". ☆122 Updated last year
- [CVPR 2025] Program synthesis for 3D spatial reasoning ☆54 Updated 7 months ago
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆23 Updated last year
- Code implementation of the paper 'FIction: 4D Future Interaction Prediction from Video' ☆17 Updated 10 months ago
- [ECCV 2024] Official Implementation of DragAPart: Learning a Part-Level Motion Prior for Articulated Objects. ☆83 Updated last year
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning ☆103 Updated 6 months ago
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio… ☆75 Updated 3 weeks ago
- A paper list that includes world models or generative video models for embodied agents. ☆26 Updated last year
- ☆55 Updated 9 months ago
- Unifying 2D and 3D Vision-Language Understanding ☆119 Updated 6 months ago
- ☆37 Updated 3 weeks ago
- Code and data for UniEgoMotion (ICCV 2025) ☆41 Updated 2 months ago
- ☆48 Updated last month
- Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation ☆48 Updated last year
- [ICCV 2025] Improving 3D Large Language Model via Robust Instruction Tuning ☆66 Updated 3 months ago