arijitray1993/awesome-spatial-reasoning

Collection of the latest spatial, 3D, and video/temporal reasoning papers

☆36

Alternatives and similar repositories for awesome-spatial-reasoning

Users that are interested in awesome-spatial-reasoning are comparing it to the libraries listed below.

lif314 / Awesome-Spatial-Intelligence
Awesome Spatial Intelligence (Personal Use)
☆54 · Jan 7, 2026 · Updated 6 months ago
arijitray1993 / SAT
Spatial Aptitude Training for Multimodal Language Models
☆33 · Feb 8, 2026 · Updated 5 months ago
mengcaopku / SpatialDreamer
SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
☆15 · Feb 1, 2026 · Updated 5 months ago
Yui010206 / Adaptive-Visual-Imagination-Control
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning
☆18 · Jun 2, 2026 · Updated last month
yunfeixie233 / ViGaL
☆70 · Feb 4, 2026 · Updated 5 months ago
mll-lab-nu / Awesome-Spatial-Intelligence-in-VLM
A paper list for spatial reasoning
☆766 · Jan 19, 2026 · Updated 6 months ago
shiqichen17 / AdaptVis
GitHub repository for "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas" (ICML 2025)
☆76 · May 2, 2025 · Updated last year
3DLLM-Mem / 3DLLM-Mem
☆27 · Jun 5, 2025 · Updated last year
Blazedengcy / GTASR
ICML 2026 - Joint Geometric and Trajectory Consistency Learning for One-Step Real-World Super-Resolution (GTASR)
☆15 · Jun 15, 2026 · Updated last month
Mwxinnn / UniAS
The official repo for “[WACV2025] Towards Accurate Unified Anomaly Segmentation”
☆15 · Apr 14, 2025 · Updated last year
Li-Hao-yuan / GeoThinker
☆68 · Feb 12, 2026 · Updated 5 months ago
OuyangKun10 / SpaceR
SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning
☆111 · Jul 9, 2025 · Updated last year
DripNowhy / Sherlock
[NeurIPS 2025] Official Implementation of paper "Sherlock: Self-Correcting Reasoning in Vision-Language Models"
☆31 · Jun 4, 2026 · Updated last month
jlloyd237 / pyfranka
A Python/C++ library for controlling the Franka Panda robot
☆15 · Jan 17, 2024 · Updated 2 years ago
ai4ce / INT-ACT
Official repo for From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models
☆33 · Nov 2, 2025 · Updated 8 months ago
mll-lab-nu / MindCube
☆163 · Mar 23, 2026 · Updated 3 months ago
Fr0zenCrane / UniCoT
[ICLR 2026] Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision
☆233 · May 31, 2026 · Updated last month
FengheTan9 / HySparK
[MICCAI 2024] HySparK: Hybrid Sparse Masking for Large Scale Medical Image Pre-Training
☆22 · Nov 17, 2024 · Updated last year
Zzzingzzz / LIMITI_SUMMER_CAMP_CV
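Online learning materials for the vision group of the 2023 LIMITI robotics team summer camp, University of Electronic Science and Technology of China
☆14 · Aug 14, 2024 · Updated last year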
XYPB / CLEFT
Official Implementation of "CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning" on MIC…
☆18 · Feb 12, 2025 · Updated last year
taco-group / NuScenes-SpatialQA
☆19 · Apr 10, 2025 · Updated last year
FouierL / EquS
[WACV 2026] Official Code of the paper “Equivariant Sampling for Improving Diffusion Model-based Image Restoration”
☆19 · Jan 29, 2026 · Updated 5 months ago
meera1hahn / Graph_LED
Localization via embodied dialog on the navigation graph
☆15 · Apr 18, 2022 · Updated 4 years ago
KJ-Waller / DQN-PyTorch-Breakout
An attempt at recreating DeepMind's implementation of Deep Q Learning on Atari Breakout using PyTorch
☆13 · Jan 16, 2020 · Updated 6 years ago
zhengxuJosh / Awesome-Multimodal-Spatial-Reasoning
This repository collects and organises state‑of‑the‑art papers on spatial reasoning for Multimodal Vision–Language Models (MVLMs).
☆319 · Feb 17, 2026 · Updated 5 months ago
Beckschen / LLaVolta
[NeurIPS 2024] Efficient Large Multi-modal Models via Visual Context Compression
☆66 · Feb 19, 2025 · Updated last year
lucasjinreal / MLLM_Factory
A Dead Simple and Modularized Multi-Modal Training and Finetune Framework. Compatible with any LLaVA/Flamingo/QwenVL/MiniGemini etc series …
☆19 · Apr 24, 2024 · Updated 2 years ago
WuTao-CS / VideoMaker
This is the official implementation of VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Mode…
☆17 · Mar 4, 2025 · Updated last year
DelinQu / pj-probe
A Visualization Tool for GPU Occupancy on S Cluster.
☆13 · Nov 16, 2022 · Updated 3 years ago
LINs-lab / cluster_tutorial
☆17 · Mar 19, 2026 · Updated 4 months ago
LiHaoHN / SimX-OR
SimX-OR: Extending Any Simulation Benchmark to Evaluate the Observational Robustness of VLA Models
☆33 · Nov 4, 2025 · Updated 8 months ago
FYYDCC / IVT-LR
Official repository for “Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space”
☆18 · Jan 27, 2026 · Updated 5 months ago
sangminwoo / RITUAL
Official PyTorch implementation of "RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language…
☆14 · Dec 16, 2024 · Updated last year
lil-lab / knotgym
[NeurIPS DB 2025] A gym environment for visual spatial reasoning - knot so simple :)
☆18 · Jun 9, 2026 · Updated last month
Bond1995 / Markov
Code for experiments on transformers using Markovian data.
☆22 · Nov 22, 2024 · Updated last year
wrudman / NOTICE
☆14 · Apr 10, 2025 · Updated last year
qmeng99 / Multiview-Motion-Estimation-for-3D-cardiac-motion-tracking
Code for paper 'MulViMotion: Shape-aware 3D Myocardial Motion Tracking from Multi-View Cardiac MRI'
☆13 · Sep 2, 2022 · Updated 3 years ago
jyansir / tmlp
[KDD 2024] Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs
☆12 · Mar 3, 2025 · Updated last year
AnjieCheng / SpatialRGPT
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
☆335 · Dec 14, 2024 · Updated last year