sled-group/COMFORT

[ICLR 2025 Oral] Official Implementation for "Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities"

☆22

Alternatives and similar repositories for COMFORT

Users interested in COMFORT are comparing it to the repositories listed below.

cfpark00 / concept-learning
Concept Learning Dynamics
☆17 · Oct 29, 2024 · Updated last year
KAIST-Visual-AI-Group / APC-VLM
[ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
☆66 · Sep 12, 2025 · Updated 10 months ago
VincentDENGP / 3D-LR
Can 3D Vision-Language Models Truly Understand Natural Language?
☆20 · Mar 28, 2024 · Updated 2 years ago
uvavision / SyViC
[ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data
☆13 · Sep 30, 2023 · Updated 2 years ago
hunarbatra / SpatialThinker
SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards
☆40 · Jan 28, 2026 · Updated 5 months ago
AnjieCheng / SpatialRGPT
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
☆335 · Dec 14, 2024 · Updated last year
nianticlabs / placeit3d
[ICCV 2025] PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes
☆64 · Oct 3, 2025 · Updated 9 months ago
LiamLian0727 / Euclids_Gift
[CVPR 2026 Findings] This repo is the official implementation of "Euclid’s Gift: Enhancing Spatial Perception and Reasoning in Vision‑La…
☆28 · Mar 15, 2026 · Updated 4 months ago
chengzu-li / MVoT
Imagine While Reasoning in Space: Multimodal Visualization-of-Thought (ICML 2025)
☆78 · Apr 12, 2025 · Updated last year
damianomarsili / VADAR
[CVPR 2025] Program synthesis for 3D spatial reasoning
☆61 · Jun 16, 2025 · Updated last year
Ingrid725 / LaPE
☆19 · Mar 28, 2024 · Updated 2 years ago
arijitray1993 / SAT
Spatial Aptitude Training for Multimodal Language Models
☆33 · Feb 8, 2026 · Updated 5 months ago
Haochen-Wang409 / ross3d
[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
☆70 · Jul 22, 2025 · Updated last year
wjpoom / SPEC
[CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding"
☆52 · Jun 16, 2025 · Updated last year
FatemehShiri / Spatial-MM
☆12 · Jan 10, 2025 · Updated last year
yliu-cs / SSR
[NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
☆40 · Oct 14, 2025 · Updated 9 months ago
bundlesdf / bundlesdf.github.io
☆14 · Jun 22, 2023 · Updated 3 years ago
transductive-visualprogram / tvp
☆15 · Jan 7, 2026 · Updated 6 months ago
SuhZhang / GeoSR
The code for paper 'Make Geometry Matter for Spatial Reasoning'
☆53 · Updated this week
XPR2004 / SpatialBench
Code and dataset for paper "SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition"
☆19 · Mar 17, 2026 · Updated 4 months ago
ExplainableML / ImageSelect
Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection"
☆27 · Jul 10, 2023 · Updated 3 years ago
mengfeidu / EmbSpatial-Bench
☆32 · Jun 24, 2024 · Updated 2 years ago
AlvinWen428 / spatial-relation-benchmark
☆15 · Oct 12, 2024 · Updated last year
xUhEngwAng / pinyin
This repository contains the pinyin input method assignment I completed for an artificial intelligence course.
☆11 · Feb 16, 2022 · Updated 4 years ago
XingruiWang / 3D-Aware-VQA
Official Code for the NeurIPS'23 paper "3D-Aware Visual Question Answering about Parts, Poses and Occlusions"
☆21 · Oct 17, 2024 · Updated last year
Chenyu-Wang567 / All-Angles-Bench
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs
☆69 · Mar 22, 2026 · Updated 3 months ago
tokeron / DiffusionLens
☆16 · Jan 30, 2025 · Updated last year
marco-garosi / COPS
Official implementation of the WACV 2025 paper "3D Part Segmentation via Geometric Aggregation of 2D Visual Features"
☆25 · Jun 8, 2025 · Updated last year
STARE-bench / STARE
☆19 · Oct 12, 2025 · Updated 9 months ago
AnjieCheng / SR-3D
[ICLR'26] This repository is the implementation of "3D Aware Region Prompted Vision Language Model"
☆28 · Feb 19, 2026 · Updated 5 months ago
stogiannidis / srbench
Source code for the Paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models"
☆19 · Feb 1, 2026 · Updated 5 months ago
QC-LY / UiG
Code for "Understanding-in-Generation: Reinforcing Generative Capability of Unified Model via Infusing Understanding into Generation"
☆15 · Nov 11, 2025 · Updated 8 months ago
behretj / LostFound
[RA-L] Lost & Found dynamically tracks object poses from egocentric videos while updating a scene graph, enabling richer semantic 3D unde…
☆60 · Sep 29, 2025 · Updated 9 months ago
McGill-NLP / diffusion-itm
Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?"
☆33 · Mar 15, 2024 · Updated 2 years ago
johnson111788 / SpatialReasoner
Training recipe for SpatialReasoner [NeurIPS 2025]
☆45 · Apr 5, 2026 · Updated 3 months ago
Vegetebird / CA-MLLM
[ICLR 2026] Official implementation of the paper "📷 On the Generalization Capacities of MLLMs for Spatial Intelligence"
☆29 · Mar 17, 2026 · Updated 4 months ago
tiptop-robot / tiptop
Official Repo for TiPToP: A Modular Open-Vocabulary Planning System for Robotic Manipulation
☆131 · Jun 24, 2026 · Updated 3 weeks ago
sled-group / RACER
[ICRA 2025] RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning
☆46 · Oct 10, 2024 · Updated last year
mll-lab-nu / Awesome-Spatial-Intelligence-in-VLM
A paper list for spatial reasoning
☆766 · Jan 19, 2026 · Updated 6 months ago