SIBench / Awesome-Visual-Spatial-ReasoningLinks
This is a project about visual spatial reasoning.
☆89Updated last month
Alternatives and similar repositories for Awesome-Visual-Spatial-Reasoning
Users that are interested in Awesome-Visual-Spatial-Reasoning are comparing it to the libraries listed below
Sorting:
- [2025CVPR] FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation☆51Updated 2 months ago
- Official code release for paper "Robo-Imagine: A Robotic Video Generation Model, For Autoregressive Long-Term Task Video Generation With …☆28Updated 6 months ago
- EO: Open-source Unified Embodied Foundation Model Series☆290Updated 2 months ago
- GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models☆486Updated 4 months ago
- ☆23Updated last month
- Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"☆282Updated last month
- Official repo of paper "SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models". A post-training framework that creates a cost-e…☆92Updated 2 months ago
- ☆59Updated 7 months ago
- 🚀 Daily AI Research Digest: Tracking breakthroughs in AI/NLP/CV/Robotics with dynamic updates and paper navigation.☆66Updated this week
- Official implementation of "Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance"☆374Updated 2 weeks ago
- A python script for downloading huggingface datasets and models.☆20Updated 10 months ago
- [NeurIPS 2025 spotlight] Official implementation for "FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving…☆596Updated 4 months ago
- Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence☆431Updated this week
- Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"☆109Updated last month
- Official implementation for "Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts"☆22Updated 7 months ago
- Official code repository of Shuffle-R1☆25Updated 2 weeks ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, …☆203Updated 9 months ago
- [NeurIPS 2025]⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.☆271Updated 4 months ago
- [ACM MM 2025] SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation. https://arxiv.org/abs/2506.03139☆75Updated 3 months ago
- [ICML2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection☆25Updated 7 months ago
- A paper list for spatial reasoning☆638Updated 3 weeks ago
- [CVPR 2025] Official implementation of paper "MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders".☆48Updated 8 months ago
- A benchmark evaluates LLMs' performance in automating drawing revision tasks.☆57Updated last month
- [arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps☆72Updated 3 weeks ago
- vue3-elementPlus-admin,vue3-elementPlus-template☆59Updated 2 months ago
- [ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following☆116Updated 2 months ago
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning☆103Updated 7 months ago
- [ICLR 2025] The offical implementation of "PSEC: Skill Expansion and Composition in Parameter Space", a new framework designed to facilit…☆63Updated 11 months ago
- [ICCV 2025] HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation☆230Updated 6 months ago
- ☆22Updated 8 months ago