stogiannidis/srbench


stogiannidis / srbench

Source code for the paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models"

☆19

Alternatives and similar repositories for srbench

Users who are interested in srbench are comparing it to the repositories listed below.


qizekun / OmniSpatial
[ICLR 2026] OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models
☆88 · Jan 21, 2026 · Updated 6 months ago
Asterisci / Language-Assisted-3D
[AAAI 2023 Oral] Language-Assisted 3D Feature Learning for Semantic Scene Understanding
☆12 · Aug 1, 2023 · Updated 2 years ago
EGO4D / ego-exo4d-egopose
☆18 · Apr 16, 2024 · Updated 2 years ago
ASTRAL-Group / LoRe
When Reasoning Meets Its Laws
☆38 · Jan 2, 2026 · Updated 6 months ago
ASTRAL-Group / AlphaOne
[EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
☆89 · Jun 10, 2025 · Updated last year
KAIST-Visual-AI-Group / APC-VLM
[ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
☆66 · Sep 12, 2025 · Updated 10 months ago
Yukun66 / Video_Director
VideoDirector [CVPR 2025]
☆36 · Nov 25, 2025 · Updated 7 months ago
aimagelab / COGT
[ICLR 2025] Causal Graphical Models for Vision-Language Compositional Understanding
☆10 · Apr 15, 2025 · Updated last year
rongyaofang / FeatAug-DETR
Official repository of paper: "FeatAug-DETR: Enriching One-to-Many Matching for DETRs with Feature Augmentation"
☆26 · Mar 2, 2023 · Updated 3 years ago
zsc000722 / PPT
☆20 · Sep 27, 2024 · Updated last year
YBZh / MaskSurf
Masked Surfel Prediction for Self-Supervised Point Cloud Learning
☆27 · Dec 6, 2023 · Updated 2 years ago
zhangbw17 / MV-Adapter
An official pytorch implementation of the paper: [MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval].
☆14 · Jul 27, 2024 · Updated last year
apple / ml-space-benchmark
Code and data for "Does Spatial Cognition Emerge in Frontier Models?"
☆29 · Apr 18, 2025 · Updated last year
xhanxu / MoST
[CVPR 2025] MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
☆21 · Sep 20, 2025 · Updated 10 months ago
farakiko / ImageSegmentationPASCAL
☆14 · Jun 5, 2020 · Updated 6 years ago
SAGNIKMJR / ego-AV-spatial-correspondence
[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'
☆14 · Jun 16, 2024 · Updated 2 years ago
xiaoneil / LPNet
☆13 · Nov 28, 2021 · Updated 4 years ago
facebookresearch / ego-env
Human-centric environment representations from egocentric video
☆15 · Feb 5, 2026 · Updated 5 months ago
jicheol93 / PLOT
☆13 · Feb 13, 2025 · Updated last year
MediaBrain-SJTU / GSC
☆14 · Jul 13, 2024 · Updated 2 years ago
BolinLai / CSTS
[ECCV2024] The official implementation of "Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation".
☆16 · Feb 24, 2025 · Updated last year
CrossmodalGroup / ESL
☆12 · May 3, 2024 · Updated 2 years ago
gyx-gloria / DMT
Official Implementation of DMT: Dual Mean-Teacher in PyTorch.
☆10 · Oct 27, 2023 · Updated 2 years ago
qizekun / VPP
[NeurIPS 2023] VPP: Efficient Conditional 3D Generation via Voxel-Point Progressive Representation
☆37 · Jun 27, 2024 · Updated 2 years ago
Harvard-AI-and-Robotics-Lab / FiVE-Bench
[ICCV 2025] FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models
☆19 · Aug 26, 2025 · Updated 10 months ago
antonybudianto / next-gpt
Opinionated ChatGPT Client with Next.js, Tailwinds, Firebase, Vercel Edge Streaming.
☆11 · Apr 1, 2025 · Updated last year
andreaskontogiannis / tres
Official code implementation of "Tree-based Focused Web Crawling with Reinforcement Learning" and the TRES framework
☆24 · Feb 16, 2026 · Updated 5 months ago
DynaMath / DynaMath
A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
☆30 · Nov 25, 2024 · Updated last year
Minglu58 / TA2V
☆16 · Dec 1, 2025 · Updated 7 months ago
Jahawn-Wen / CAMeL-reID
[IEEE Transactions on Information Forensics and Security'25] Pytorch implementation of CAMeL: Cross-modality Adaptive Meta-Learning for T…
☆17 · Jan 5, 2026 · Updated 6 months ago
ZhangWeihang99 / HVSA
Official PyTorch implementation for Hypersphere-Based Remote Sensing Cross-Modal Text–Image Retrieval via Curriculum Learning.
☆16 · Aug 10, 2024 · Updated last year
mengfeidu / EmbSpatial-Bench
☆32 · Jun 24, 2024 · Updated 2 years ago
KovenYu / EON-VoteNet
☆23 · Jul 3, 2022 · Updated 4 years ago
MichWozPol / LEGO_StableDiffusion
The project aim was to fine-tune the stable diffusion model in order to generate images in the LEGO style based on the prompt.
☆16 · Jun 7, 2023 · Updated 3 years ago
gchochla / LLM-multilabel-differently
[Main EMNLP'25] LLMs do Multi-Label Classification Differently
☆18 · Feb 28, 2026 · Updated 4 months ago
Ji-Haoyang / FGVLA
The code of Fine-Grained Visual-Language Alignment for Remote Sensing Image-Text Retrieval (IEEE Transactions on Geoscience and Remote Sen…
☆15 · Jun 30, 2025 · Updated last year
Tangkexian / LEGO-Puzzles
Benchmarking Multi-Step Spatial Reasoning in MLLMs with LEGO-based VQA & generation tasks.
☆37 · Jun 20, 2025 · Updated last year
PKU-EPIC / GAPartNet
[CVPR 2023 Highlight] GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable …
☆162 · Oct 29, 2024 · Updated last year
XLearning-SCU / 2024-TIP-CREAM
PyTorch implementation for Cross-modal Retrieval with Noisy Correspondence via Consistency Refining and Mining (TIP 2024)
☆22 · Mar 25, 2024 · Updated 2 years ago