BAAI-DCAI / SpatialBot
The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models."
☆219Updated last month
Alternatives and similar repositories for SpatialBot:
Users who are interested in SpatialBot are comparing it to the libraries listed below:
- ☆296Updated last month
- The repo for the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation`☆89Updated 2 months ago
- [ICML 2024] Official code repository for 3D embodied generalist agent LEO☆417Updated last month
- The Official Implementation of RoboMatrix☆83Updated 2 months ago
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World☆127Updated 4 months ago
- A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation☆182Updated last month
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization☆90Updated last month
- Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"☆231Updated 10 months ago
- Official code for the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"☆78Updated 3 weeks ago
- [ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation☆103Updated 3 weeks ago
- The official codebase for ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation (CVPR 2024)☆116Updated 8 months ago
- Official repo of VLABench, a large-scale benchmark designed for fairly evaluating VLAs, embodied agents, and VLMs.☆162Updated last week
- Embodied Chain of Thought: A robotic policy that reasons to solve the task.☆168Updated this week
- [ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model☆444Updated 4 months ago
- ☆155Updated 2 weeks ago
- Official implementation of "Data Scaling Laws in Imitation Learning for Robotic Manipulation"☆151Updated 4 months ago
- ☆62Updated 3 weeks ago
- [ECCV 2024] ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation☆215Updated 4 months ago
- ☆320Updated 10 months ago
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation☆100Updated last week
- [IROS24 Oral] ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models☆85Updated 6 months ago
- ☆50Updated 3 weeks ago
- [NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"☆133Updated 2 months ago
- [arXiv 2023] Embodied Task Planning with Large Language Models☆170Updated last year