arijitray1993/SAT

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/arijitray1993/SAT)

arijitray1993 / SAT

Spatial Aptitude Training for Multimodal Langauge Models

☆24

Alternatives and similar repositories for SAT

Users that are interested in SAT are comparing it to the libraries listed below

Sorting:

hany01rye / tiger
View on GitHub
TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics
☆22Nov 18, 2025Updated 3 months ago
ys-zong / MIRB
View on GitHub
Benchmarking Multi-Image Understanding in Vision and Language Models
☆12Jul 29, 2024Updated last year
mahtabbigverdi / Aurora-perception
View on GitHub
☆46Feb 18, 2026Updated 2 weeks ago
ap229997 / wild-hoi
View on GitHub
[ECCV'24] 3D Reconstruction of Objects in Hands without Real World 3D Supervision
☆17Feb 3, 2025Updated last year
KAIST-Visual-AI-Group / StochSync
View on GitHub
Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronizat…
☆21Jun 24, 2025Updated 8 months ago
sled-group / COMFORT
View on GitHub
[ICLR 2025 Oral] Official Implementation for "Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Un…
☆21Oct 24, 2024Updated last year
vulab-AI / Awesome-Spatial-VLMs
View on GitHub
[Awesome-Spatial-VLMs] This repository is the official, community-maintained resource for the survey paper: Spatial Intelligence in Visio…
☆64Feb 16, 2026Updated 2 weeks ago
XIAO4579 / Vlm-interpretability
View on GitHub
Official implementation for the paper"Towards Understanding How Knowledge Evolves in Large Vision-Language Models"
☆28Apr 10, 2025Updated 10 months ago
kerrj / eyerobot
View on GitHub
Source code for EyeRobot
☆41Dec 1, 2025Updated 3 months ago
Jaraxxus-Me / LogiCity
View on GitHub
LogiCity@NeurIPS'24, D&B track. A multi-agent inductive learning environment for "abstractions".
☆27Jun 10, 2025Updated 8 months ago
IRVLUTD / SceneReplica
View on GitHub
Code release for SceneReplica paper.
☆29Jul 24, 2025Updated 7 months ago
psmmettes / spatial-aware-object-embeddings
View on GitHub
Code base for zero-shot action localization through spatial-aware object embeddings
☆25Nov 3, 2017Updated 8 years ago
MINT-SJTU / STI-Bench
View on GitHub
STI-Bench : Are MLLMs Ready for Precise Spatial-Temporal World Understanding?
☆37Jan 12, 2026Updated last month
wentaoyuan / RoboPoint
View on GitHub
A Vision-Language Model for Spatial Affordance Prediction in Robotics
☆214Jul 17, 2025Updated 7 months ago
TritiumR / Prompting-with-the-Future
View on GitHub
Implementation of Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins. [RSS 2025]
☆49Oct 21, 2025Updated 4 months ago
amitakamath / whatsup_vlms
View on GitHub
Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".
☆71Feb 28, 2024Updated 2 years ago
wz0919 / AnchorWeave
View on GitHub
Official implementation of AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories
☆78Feb 17, 2026Updated 2 weeks ago
ricl-vla / ricl_openpi
View on GitHub
☆56Aug 7, 2025Updated 6 months ago
NVlabs / L4P
View on GitHub
(3DV 2026 Oral) L4P -- a feed-forward foundational model designed for multiple low-level 4D vision perception tasks.
☆60Dec 9, 2025Updated 2 months ago
zhouyiks / CoLVA
View on GitHub
☆42Jul 9, 2025Updated 7 months ago
NVlabs / AHA
View on GitHub
A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
☆58Apr 1, 2025Updated 11 months ago
ControlVLA / ControlVLA
View on GitHub
Code Repository for ControlVLA, CoRL2025.
☆85Oct 26, 2025Updated 4 months ago
taco-group / STAMP
View on GitHub
[ICLR'25] Official Implementation of STAMP: Scalable Task And Model-agnostic Collaborative Perception
☆56Feb 4, 2025Updated last year
TOPReward / TOPReward
View on GitHub
TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics
☆42Updated this week
wkindling / Skeleton-Poser
View on GitHub
☆11May 3, 2019Updated 6 years ago
google-deepmind / neural_assets
View on GitHub
☆20Sep 5, 2025Updated 5 months ago
THUNLP-MT / Scaffold
View on GitHub
Scaffold Prompting to promote LMMs
☆46Dec 16, 2024Updated last year
MM-FIRE / FIRE
View on GitHub
☆13Nov 5, 2024Updated last year
bammt / Learn-to-check
View on GitHub
the datasets of our paper
☆11Feb 26, 2024Updated 2 years ago
cyrusneary / vlaps
View on GitHub
☆20Aug 22, 2025Updated 6 months ago
dcarpintero / minerva
View on GitHub
Multi-Agent LLM System for Digital Scam Protection
☆12Dec 19, 2024Updated last year
joanby / curso-numerico-3
View on GitHub
Tercera y última parte de la saga de métodos numéricos con Python
☆11May 30, 2022Updated 3 years ago
sled-group / RACER
View on GitHub
[ICRA 2025] RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning
☆41Oct 10, 2024Updated last year
MarionLepert / phantom
View on GitHub
☆77Aug 29, 2025Updated 6 months ago
robo-arena / roboarena
View on GitHub
Distributed, scalable benchmarking of generalist robot policies.
☆84Feb 10, 2026Updated 3 weeks ago
ashishpatel26 / IBM-Quantum-Challenge-Spring-2023-Challenge
View on GitHub
IBM Quantum Challenge Fall 2023
☆10May 23, 2023Updated 2 years ago
kaist-ami / AVHBench
View on GitHub
[ICLR'25] Official repository for "AVHBench: A Cross-Modal Hallucination Evaluation for Audio-Visual Large Language Models"
☆20Feb 25, 2026Updated last week
alibaba-damo-academy / EOCBench
View on GitHub
[NeurIPS 2025] EOC-Bench, an innovative benchmark designed to systematically evaluate object-centric embodied cognition in dynamic egocen…
☆22Jun 17, 2025Updated 8 months ago
morning9393 / ETPO
View on GitHub
☆14Mar 5, 2024Updated 2 years ago