SIBench / Awesome-Visual-Spatial-Reasoning
This is a project about visual spatial reasoning.
☆76Updated last week
Alternatives and similar repositories for Awesome-Visual-Spatial-Reasoning
Users who are interested in Awesome-Visual-Spatial-Reasoning are comparing it to the repositories listed below
- Official code release for paper "Robo-Imagine: A Robotic Video Generation Model, For Autoregressive Long-Term Task Video Generation With …☆27Updated 3 months ago
- [2025CVPR] FlowRAM: Grounding Flow Matching Policy with Region-Aware Mamba Framework for Robotic Manipulation☆36Updated 2 months ago
- EO: Open-source Unified Embodied Foundation Model Series☆265Updated 3 weeks ago
- [CVPR 2025] Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation☆70Updated last month
- GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models☆449Updated last month
- ☆53Updated 4 months ago
- TorchHook: A PyTorch hooks manager, providing convenient interfaces to capture feature maps and debug models.☆12Updated last month
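- An algorithm for collision avoidance between a drone swarm and a drone joining it. There are two approaches: (1) the joining drone plans its path and actively maneuvers to avoid the formation; (2) the formation makes small adjustments to give way to the joining drone. Only the first approach is implemented so far. The only known information is the original swarm's trajectory F(x,y,z,t)|each plane. For the first approach, the only input parameter for the fill-in aircraft is…☆26Updated 2 months ago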
- [ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following☆109Updated last month
- ☆22Updated 2 months ago
- vue3-elementPlus-admin, vue3-elementPlus-template☆49Updated last week
- A benchmark that evaluates LLMs' performance in automating drawing revision tasks.☆56Updated 2 months ago
- [ACM MM 2025] SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation. https://arxiv.org/abs/2506.03139☆67Updated 4 months ago
- Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences (ICML 2025)☆25Updated 4 months ago
- A python script for downloading huggingface datasets and models.☆20Updated 6 months ago
- Embodied Intelligence in Endovascular Robot Navigation (embodied navigation for vascular interventional surgical robots)☆17Updated last week
- [NeurIPS 2025]⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.☆230Updated last month
- [CVPR 2024] RankMatch: Exploring the Better Consistency Regularization for Semi-supervised Semantic Segmentation☆30Updated 8 months ago
- [ICML2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection☆24Updated 4 months ago
- Official Repository of ACL 2025 paper OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference☆143Updated 8 months ago
- User interview platform☆23Updated 3 months ago
- Official repo of the paper "SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models". A post-training framework that creates a …☆81Updated 3 weeks ago
- A framework for unified personalized model, achieving mutual enhancement between personalized understanding and generation. Demonstrating…☆123Updated last month
- VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model☆345Updated 6 months ago
- Official implementation of MC-LLaVA.☆140Updated 2 months ago
- Collections of Papers and Projects for Multimodal Reasoning.☆104Updated 6 months ago
- Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"☆76Updated last month
- [CVPR 2025] Official implementation of paper "MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders".☆45Updated 5 months ago
- ☆20Updated last month
- [NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration☆87Updated 5 months ago