arijitray1993 / awesome-spatial-reasoning
Collection of the latest spatial, 3D, and video/temporal reasoning papers
☆27 · Updated 2 months ago
Alternatives and similar repositories for awesome-spatial-reasoning
Users who are interested in awesome-spatial-reasoning are comparing it to the repositories listed below.
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆160 · Updated last month
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆22 · Updated last year
- A list of works on video generation towards world model ☆222 · Updated this week
- Self-reimplemented version of 4D-LRM. ☆63 · Updated 6 months ago
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio… ☆63 · Updated last month
- ☆100 · Updated 3 weeks ago
- Code implementation of the paper "World-in-World: World Models in a Closed-Loop World" ☆98 · Updated this week
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts. ☆199 · Updated last month
- Repo for "Human-Centric Foundation Models: Perception, Generation and Agentic Modeling" (https://arxiv.org/abs/2502.08556) ☆54 · Updated 9 months ago
- A paper list that includes world models or generative video models for embodied agents. ☆25 · Updated 10 months ago
- Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning ☆53 · Updated last month
- OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models ☆72 · Updated 2 months ago
- [NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning" ☆98 · Updated 3 weeks ago
- WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes ☆104 · Updated 8 months ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … ☆193 · Updated 6 months ago
- [NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆122 · Updated 4 months ago
- "Comp4D: Compositional 4D Scene Generation", Dejia Xu*, Hanwen Liang*, Neel P. Bhatt, Hezhen Hu, Hanxue Liang, Konstantinos N. Platanioti… ☆78 · Updated last year
- Program synthesis for 3D spatial reasoning ☆53 · Updated 5 months ago
- [NeurIPS 2025 Spotlight] MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning ☆66 · Updated 2 months ago
- Official implementation of Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling ☆94 · Updated this week
- UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding ☆56 · Updated 3 months ago
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆61 · Updated last year
- ☆47 · Updated 5 months ago
- Official implementation of ICCV 2025 paper "EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds". ☆40 · Updated 5 months ago
- ☆153 · Updated 10 months ago
- [ARXIV’25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control ☆85 · Updated 4 months ago
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆42 · Updated 11 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆78 · Updated 5 months ago
- [ECCV 2024] EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion. ☆101 · Updated last year
- [ECCV 2024] Official Implementation of DragAPart: Learning a Part-Level Motion Prior for Articulated Objects. ☆83 · Updated last year