SIBench / Awesome-Visual-Spatial-Reasoning
This is a project about visual spatial reasoning.
☆81Updated 2 weeks ago
Alternatives and similar repositories for Awesome-Visual-Spatial-Reasoning
Users who are interested in Awesome-Visual-Spatial-Reasoning are comparing it to the repositories listed below:
- EO: Open-source Unified Embodied Foundation Model Series☆277Updated last month
- [2025CVPR] FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation☆44Updated last month
- GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models☆463Updated 2 months ago
- Official code release for paper "Robo-Imagine: A Robotic Video Generation Model, For Autoregressive Long-Term Task Video Generation With …☆28Updated 5 months ago
- [CVPR 2025] Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation☆74Updated 3 weeks ago
- ☆57Updated 5 months ago
- ☆23Updated last month
- Official repo of paper "SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models". A post-training framework that creates a cost-e…☆88Updated 3 weeks ago
- [ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following☆114Updated last week
- A benchmark that evaluates LLMs' performance in automating drawing revision tasks.☆56Updated 3 months ago
- vue3-elementPlus-admin, vue3-elementPlus-template☆56Updated last month
- [NeurIPS 2025 spotlight] Official implementation for "FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving…☆507Updated 2 months ago
- 🚀 Daily AI Research Digest: Tracking breakthroughs in AI/NLP/CV/Robotics with dynamic updates and paper navigation.☆51Updated this week
- VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model☆343Updated 8 months ago
- Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"☆91Updated 3 weeks ago
- TorchHook: A PyTorch hooks manager, providing convenient interfaces to capture feature maps and debug models.☆13Updated 2 months ago
- Official implementation of MC-LLaVA.☆139Updated last month
- ☆21Updated 3 weeks ago
- [ACM MM 2025] SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation. https://arxiv.org/abs/2506.03139☆74Updated last month
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning☆99Updated 5 months ago
- [ICML2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection☆24Updated 5 months ago
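- This algorithm handles collision avoidance when a drone joins an existing drone swarm. There are two approaches: (1) the joining drone plans its path and actively maneuvers to avoid the formation, and (2) the formation makes fine adjustments to give way to the joiner. Only the first approach has been implemented so far. The only known information is the motion trajectory of the original swarm, F(x,y,z,t)|each plane. For the first approach, the only input parameter for the gap-filling aircraft is…☆28Updated 4 months ago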
- Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence☆408Updated 2 weeks ago
- Survey: https://arxiv.org/pdf/2507.20198☆243Updated last month
- [ICCV 2025] HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation☆204Updated 5 months ago
- [CVPR 2025] Official implementation of paper "MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders".☆47Updated 6 months ago
- A framework for unified personalized model, achieving mutual enhancement between personalized understanding and generation. Demonstrating…☆127Updated 2 months ago
- Official Repository of ACL 2025 paper OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference☆144Updated 9 months ago
- A paper list for spatial reasoning☆521Updated last week
- [NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration☆100Updated 2 weeks ago