YuZhaoshu / Efficient-VLA-Survey
🔥 This is a curated list of research for the survey "A Survey on Efficient Vision-Language-Action Models". We will continue to maintain and update the repository, so follow us to keep up with the latest developments!
★22 · Updated this week
Alternatives and similar repositories for Efficient-VLA-Survey
Users interested in Efficient-VLA-Survey are comparing it to the repositories listed below.
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge (★197, updated last month)
- starVLA: A Lego-like Codebase for Vision-Language-Action Model Development (★79, updated this week)
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model (★292, updated 2 weeks ago)
- LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained 🔥] (★161, updated 3 weeks ago)
- Official code of the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" (★111, updated 8 months ago)
- ICCV2025 (★135, updated last month)
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy (★129, updated this week)
- Unified Vision-Language-Action Model (★207, updated this week)
- [NeurIPS 2025] VLA-Cache: Towards Efficient Vision-Language-Action Model via Adaptive Token Caching in Robotic Manipulation (★33, updated 3 weeks ago)
- OpenHelix: An Open-source Dual-System VLA Model for Robotic Manipulation (★291, updated last month)
- [NeurIPS 2025 Spotlight] SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation (★194, updated 3 months ago)
- WorldVLA: Towards Autoregressive Action World Model (★445, updated last week)
- Efficiently apply modification functions to RLDS/TFDS datasets (★25, updated last year)
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics" (★173, updated 2 weeks ago)
- [arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence (★54, updated 2 months ago)
- Official repo of VLABench, a large-scale benchmark designed for fairly evaluating VLA models, embodied agents, and VLMs (★303, updated 2 months ago)
- 🦾 A Dual-System VLA with System-2 Thinking (★112, updated last month)
- Paper list for the survey "A Survey on Vision-Language-Action Models: An Action Tokenization Perspective" (★283, updated 3 months ago)
- Latest Advances on Vision-Language-Action Models (★113, updated 7 months ago)
- [ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos (★135, updated 2 weeks ago)
- (untitled repository, ★323, updated 6 months ago)
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models) (★121, updated last year)
- A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation (★358, updated 4 months ago)
- The repo of the paper "RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation" (★137, updated 9 months ago)
- [arXiv 2025] Official code for MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Ma… (★47, updated 2 months ago)
- [NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding (★60, updated 3 weeks ago)
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning (★47, updated 6 months ago)
- Building General-Purpose Robots Based on Embodied Foundation Model (★533, updated 3 weeks ago)
- (untitled repository, ★164, updated last month)
- Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation" (★86, updated last month)