johnson111788/SpatialReasoner

Training recipe for SpatialReasoner [NeurIPS 2025]

☆45

Alternatives and similar repositories for SpatialReasoner

Users who are interested in SpatialReasoner are comparing it to the repositories listed below.

wufeim / SpatialReasonerDataGen
Synthetic VQA data generation code for SpatialReasoner.
☆20 · Nov 25, 2025 · Updated 7 months ago
zhangquanchen / 3DThinker
[CVPR 2026] Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
☆243 · May 7, 2026 · Updated 2 months ago
THU-SI / Spatial-MLLM
[NeurIPS 2025 Spotlight] Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
☆479 · Feb 5, 2026 · Updated 5 months ago
shiqichen17 / AdaptVis
GitHub repository for "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas" (ICML 2025)
☆76 · May 2, 2025 · Updated last year
LaVi-Lab / VG-LLM
The code for the paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
☆245 · Nov 28, 2025 · Updated 7 months ago
hunarbatra / SpatialThinker
SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards
☆40 · Jan 28, 2026 · Updated 5 months ago
yliu-cs / SSR
[NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
☆40 · Oct 14, 2025 · Updated 9 months ago
UCSC-VLAA / MixCon3D
[CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"
☆35 · Apr 21, 2024 · Updated 2 years ago
sled-group / COMFORT
[ICLR 2025 Oral] Official Implementation for "Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Un…
☆22 · Oct 24, 2024 · Updated last year
AnjieCheng / SpatialRGPT
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
☆335 · Dec 14, 2024 · Updated last year
ApoorvaBeedu / VideoPose
☆17 · Feb 14, 2023 · Updated 3 years ago
HanchenTai / OV-SAM3D
Open-Vocabulary SAM3D: Understand Any 3D Scene
☆44 · Jun 9, 2025 · Updated last year
mengcaopku / SpatialDreamer
SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
☆15 · Feb 1, 2026 · Updated 5 months ago
wufeim / imagenet3d
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
☆21 · Dec 6, 2024 · Updated last year
UMass-Embodied-AGI / MindJourney
[NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"
☆151 · Nov 4, 2025 · Updated 8 months ago
arijitray1993 / SAT
Spatial Aptitude Training for Multimodal Language Models
☆33 · Feb 8, 2026 · Updated 5 months ago
damianomarsili / VADAR
[CVPR 2025] Program synthesis for 3D spatial reasoning
☆61 · Jun 16, 2025 · Updated last year
Visual-AI / 3DRS
[NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
☆158 · Dec 9, 2025 · Updated 7 months ago
marco-garosi / COPS
Official implementation of the WACV 2025 paper "3D Part Segmentation via Geometric Aggregation of 2D Visual Features"
☆25 · Jun 8, 2025 · Updated last year
sg-3d / sg3d
☆55 · Oct 3, 2024 · Updated last year
YunzeMan / Situation3D
[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning
☆44 · Dec 9, 2024 · Updated last year
ZJU-REAL / SpatialLadder
[ICLR 2026] SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models
☆99 · Jun 9, 2026 · Updated last month
WU-CVGL / GS-Reasoner
Reasoning in Space via Grounding in the World (ICLR 2025)
☆56 · Nov 3, 2025 · Updated 8 months ago
KAIST-Visual-AI-Group / PartSTAD
Official implementation of PartSTAD: 2D-to-3D Part Segmentation Task Adaptation (ECCV 2024).
☆56 · Nov 7, 2024 · Updated last year
microsoft / HiSpatial
[CVPR 2026] HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models
☆37 · Jul 2, 2026 · Updated 2 weeks ago
wj-on-un / AffordanceLLM_implementation
(Incomplete version) This is an implementation of affordancellm.
☆19 · Oct 17, 2024 · Updated last year
enlighten0707 / Symbol-LLM
Code for NeurIPS2023 Paper "Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning"
☆26 · Dec 19, 2023 · Updated 2 years ago
beacon-3d / Beacon3D
[CVPR 2025] Beacon3D: Object-centric Evaluation for 3D Grounding-QA
☆28 · Nov 25, 2025 · Updated 7 months ago
Yangr116 / VST
[ECCV2026] Visual Spatial Tuning
☆198 · Mar 25, 2026 · Updated 3 months ago
kaist-cvml / geometric-distillation
[EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation
☆39 · Jun 12, 2025 · Updated last year
DoHunLee1 / VideoGuide
[CVPR2025] Official repository for "VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide"
☆30 · May 27, 2025 · Updated last year
KAIST-Visual-AI-Group / APC-VLM
[ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
☆66 · Sep 12, 2025 · Updated 10 months ago
gca-spatial-reasoning / gca
Official Implementation of "Geometrically-Constrained Agent for Spatial Reasoning"
☆89 · Apr 7, 2026 · Updated 3 months ago
amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 · May 24, 2023 · Updated 3 years ago
zhangzaibin / spagent
SPAgent, a foundation agent for understanding, reasoning over, and operating within the physical and spatial world.
☆198 · Updated this week
hjy-u / ETOG
[ICRA 2025] A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping
☆13 · Feb 7, 2025 · Updated last year
haoningwu3639 / SpatialScore
[CVPR 2026 Highlight] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
☆84 · May 28, 2026 · Updated last month
ShijieZhou-UCLA / VLM4D
[ICCV 2025] VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
☆55 · Nov 20, 2025 · Updated 8 months ago
mll-lab-nu / MindCube
☆163 · Mar 23, 2026 · Updated 3 months ago