OpenHelix-Team / CEED-VLA
Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.
☆29 · Updated last month
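The repo's name points at its core trick: cutting decoding latency by exiting a consistency/Jacobi-style iterative refinement loop as soon as the output stabilizes. As a rough, hypothetical sketch of that idea only (not the project's actual API; `early_exit_decode`, `model_step`, and the toy refiner are all assumptions):

```python
# Hypothetical illustration of early-exit iterative decoding (NOT CEED-VLA's API).
# A whole block of draft tokens is refined in parallel each iteration; once two
# consecutive iterations agree, the block has reached a fixed point and we exit
# early instead of spending the full refinement budget.

def early_exit_decode(model_step, draft, max_iters=10):
    """Refine `draft` with `model_step` until it stabilizes or the budget runs out."""
    tokens = list(draft)
    for _ in range(max_iters):
        refined = model_step(tokens)   # one parallel refinement pass over all positions
        if refined == tokens:          # consecutive iterations agree: early exit
            return tokens
        tokens = refined
    return tokens

# Toy refiner: each position moves one step toward a fixed target sequence,
# standing in for the learned consistency model.
target = [3, 1, 4, 1, 5]
step_toward = lambda xs: [x + (t > x) - (t < x) for x, t in zip(xs, target)]

print(early_exit_decode(step_toward, [0, 0, 0, 0, 0]))  # converges to target, then exits
```

The exit test here is exact token agreement; in the paper it is consistency training that makes agreement arrive in few iterations, so treat this purely as an illustration of the control flow.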
Alternatives and similar repositories for CEED-VLA
Users interested in CEED-VLA are comparing it to the repositories listed below.
- Official implementation of BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation ☆73 · Updated last month
- Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces ☆79 · Updated 2 months ago
- EO: Open-source Unified Embodied Foundation Model Series ☆29 · Updated this week
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO ☆70 · Updated 3 months ago
- Source code for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning" ☆79 · Updated last month
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆147 · Updated 3 months ago
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning ☆73 · Updated 3 months ago
- InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation ☆39 · Updated last month
- ☆41 · Updated 2 months ago
- [CVPR 2025 Highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision ☆26 · Updated this week
- ☆13 · Updated last month
- 🦾 A Dual-System VLA with System2 Thinking ☆99 · Updated last week
- NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks ☆168 · Updated last month
- ☆37 · Updated 2 months ago
- [CVPR 2025 Highlight] Towards Autonomous Micromobility through Scalable Urban Simulation ☆106 · Updated last month
- Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" ☆138 · Updated last month
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo… ☆29 · Updated 11 months ago
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks ☆163 · Updated 3 months ago
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning", https://arxiv.org/abs/2505.13934 ☆80 · Updated 2 months ago
- PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability ☆24 · Updated 5 months ago
- ☆55 · Updated 6 months ago
- OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding ☆59 · Updated last month
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆110 · Updated 3 months ago
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models). ☆120 · Updated last year
- DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge ☆158 · Updated last week
- [ICCV 2025] GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes ☆93 · Updated last month
- [ICLR 2025] Official implementation and benchmark evaluation repository of PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆68 · Updated 3 months ago
- Repository for the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` ☆132 · Updated 8 months ago
- Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation" ☆51 · Updated last week
- Unified Vision-Language-Action Model ☆185 · Updated last month