prism-visual-spatial-intelligence / Awesome-Visual-Spatial-Reasoning

This is a project about visual spatial reasoning.

☆53

Alternatives and similar repositories for Awesome-Visual-Spatial-Reasoning

Users who are interested in Awesome-Visual-Spatial-Reasoning are comparing it to the repositories listed below.

Qiukunpeng / Siamese-Diffusion
[CVPR 2025] Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation
☆58 Updated this week
jailflip / jailflip-2025
☆23 Updated last week
solitaryTian / RLCFM
☆42 Updated 2 months ago
Metaphysicist0 / Embodied-Intelligence-in-Endovascular-Robot-Navigation
Embodied Intelligence in Endovascular Robot Navigation (embodied navigation for vascular interventional surgery robots)
☆11 Updated 3 months ago
SanMumumu / FlowRAM
[CVPR 2025] FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation
☆27 Updated last month
Egbert-Lannister / Robo-Imagine
Official code release for paper "Robo-Imagine: A Robotic Video Generation Model, For Autoregressive Long-Term Task Video Generation With …
☆23 Updated last month
ZJU-REAL / SVGenius
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation. https://arxiv.org/abs/2506.03139
☆62 Updated 2 months ago
dedaow / OTFS-channel-estimation
OTFS-channel-estimation
☆26 Updated 2 months ago
yanghlll / ScalingNoise
☆39 Updated 5 months ago
SYuan03 / MM-IFEngine
[ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following
☆101 Updated 4 months ago
ArtmeScienceLab / FonTS
[ICCV 2025] FonTS: Text Rendering with Typography and Style Controls
☆23 Updated this week
OpenDCAI / Awesome_MLLMs_Reasoning
☆104 Updated last month
BIT-DA / ABS
[ICML 2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection
☆22 Updated 2 months ago
PhoenixZ810 / OmniAlign-V
Official Repository of ACL 2025 paper OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference
☆144 Updated 6 months ago
ADaM-BJTU / Mind_with_eyes_Awesome_MLLMs_Reasoning
This repository will continuously update the latest papers, technical reports, benchmarks about multimodal reasoning!
☆48 Updated 5 months ago
PrismaX-Team / PhysUniBenchmark
☆21 Updated 2 months ago
Video-R1 / Awesome-Multimodal-Reasoning
Collections of Papers and Projects for Multimodal Reasoning.
☆105 Updated 4 months ago
lwpyh / Awesome-MLLM-Reasoning-Collection
A collection of multimodal reasoning papers, codes, datasets, benchmarks and resources.
☆292 Updated last week
Qznan / GeRe
☆25 Updated 3 weeks ago
Eason-Li-AIS / DrafterBench
A benchmark that evaluates LLMs' performance in automating drawing revision tasks.
☆57 Updated last week
arctanxarc / MC-LLaVA
Official implementation of MC-LLaVA.
☆139 Updated last week
jungao1106 / ICoT
[CVPR'25] Interleaved-Modal Chain-of-Thought
☆80 Updated last week
zzaiyan / TorchHook
TorchHook: A PyTorch hooks manager, providing convenient interfaces to capture feature maps and debug models.
☆12 Updated 3 months ago
huofushuo / SID
https://arxiv.org/abs/2408.02032
☆118 Updated 7 months ago
chocho-1115 / vue-admin
vue3-elementPlus-admin, vue3-elementPlus-template
☆31 Updated this week
hhnqqq / py_hfd
A python script for downloading huggingface datasets and models.
☆19 Updated 4 months ago
MLRM-Halu / MLRM-Halu
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
☆48 Updated 3 months ago
Wang-Xiaodong1899 / CVPR25-MLLM-Paper-List
🔥CVPR 2025 Multimodal Large Language Models Paper List
☆152 Updated 5 months ago
Ghy0501 / HiDe-LLaVA
[ACL'25 Main] Official Implementation of HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Languag…
☆25 Updated last month
saccharomycetes / mllms_know
[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'
☆249 Updated 4 months ago