YuZhaoshu / Efficient-VLAs-Survey
🔥 This is a curated paper list for the survey "A Survey on Efficient Vision-Language-Action Models". We will continue to maintain and update the repository, so follow us to keep up with the latest developments!
★110 · Updated this week
Alternatives and similar repositories for Efficient-VLAs-Survey
Users interested in Efficient-VLAs-Survey are comparing it to the repositories listed below.
- StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing · ★725 · Updated this week
- Paper list in the survey: A Survey on Vision-Language-Action Models: An Action Tokenization Perspective · ★395 · Updated 6 months ago
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model · ★333 · Updated 3 months ago
- [NeurIPS 2025] VLA-Cache: Towards Efficient Vision-Language-Action Model via Adaptive Token Caching in Robotic Manipulation · ★58 · Updated 3 months ago
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy · ★330 · Updated last week
- LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained 🔥] · ★173 · Updated 2 months ago
- OpenHelix: An Open-source Dual-System VLA Model for Robotic Manipulation · ★333 · Updated 4 months ago
- ★423 · Updated 3 weeks ago
- Building General-Purpose Robots Based on Embodied Foundation Model · ★645 · Updated last month
- Official code of the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" · ★122 · Updated 10 months ago
- Running VLA at a 30 Hz frame rate and 480 Hz trajectory frequency · ★338 · Updated last week
- Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment · ★197 · Updated 3 weeks ago
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge · ★273 · Updated this week
- The repo of the paper "RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation" · ★148 · Updated last year
- A curated list of large VLM-based VLA models for robotic manipulation · ★293 · Updated 3 weeks ago
- VLA-Arena: an open-source benchmark for systematic evaluation of Vision-Language-Action (VLA) models · ★97 · Updated this week
- Dexbotic: Open-Source Vision-Language-Action Toolbox · ★646 · Updated last week
- Unified Vision-Language-Action Model · ★257 · Updated 2 months ago
- Official code for VLA-OS · ★131 · Updated 6 months ago
- RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation · ★275 · Updated last month
- The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models" · ★326 · Updated 3 months ago
- 🔥 SpatialVLA: a spatial-enhanced vision-language-action model trained on 1.1 million real robot episodes. Accepted at RSS 2025 · ★617 · Updated 6 months ago
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models) · ★122 · Updated last year
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" · ★218 · Updated 3 weeks ago
- Galaxea's first VLA release · ★471 · Updated this week
- RynnVLA-002: A Unified Vision-Language-Action and World Model · ★818 · Updated last month
- Real-Time VLAs via Future-state-aware Asynchronous Inference · ★264 · Updated 3 weeks ago
- Latest Advances on Vision-Language-Action Models · ★124 · Updated 10 months ago
- A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation · ★395 · Updated 2 months ago
- The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model" · ★404 · Updated last week