LiAutoAD / LightVLA
LightVLA
☆73Updated last month
Alternatives and similar repositories for LightVLA
Users who are interested in LightVLA are comparing it to the repositories listed below
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).☆121Updated last year
- LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained🔥]☆174Updated 3 months ago
- 🏆 Official implementation of LangCoop: Collaborative Driving with Natural Language☆75Updated 4 months ago
- ☆87Updated 8 months ago
- MiMo-Embodied☆345Updated 2 months ago
- ☆23Updated 4 months ago
- A Multi-Modal Large Language Model with Retrieval-augmented In-context Learning capacity designed for generalisable and explainable end-t…☆120Updated last year
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"☆222Updated last month
- Nav-R1: Reasoning and Navigation in Embodied Scenes☆108Updated 3 months ago
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model☆336Updated 3 months ago
- EO: Open-source Unified Embodied Foundation Model Series☆44Updated 2 weeks ago
- [ACM CSUR 2025] Understanding World or Predicting Future? A Comprehensive Survey of World Models☆413Updated 2 months ago
- Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models☆137Updated 3 weeks ago
- [arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps☆71Updated 2 weeks ago
- 🔥 A curated roadmap to the Efficient VLA landscape. We’re keeping this list live—contribute your latest work!☆72Updated this week
- ☆63Updated last month
- 🔥This is a curated list of "A survey on Efficient Vision-Language Action Models" research. We will continue to maintain and update the r…☆118Updated 3 weeks ago
- [NeurIPS 2025] SURDS: Benchmarking Spatial Understanding and Reasoning in Driving Scenarios with Vision Language Models☆78Updated 4 months ago
- [NeurIPS 2025] VLA-Cache: Towards Efficient Vision-Language-Action Model via Adaptive Token Caching in Robotic Manipulation☆65Updated 4 months ago
- [NeurIPS 2025]⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.☆264Updated 3 months ago
- [NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning☆38Updated 3 months ago
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge☆282Updated 3 weeks ago
- [TMLR'25] AutoTrust, a groundbreaking benchmark designed to assess the trustworthiness of DriveVLMs. This work aims to enhance public saf…☆52Updated 2 months ago
- Latest Advances on Vision-Language-Action Models.☆128Updated 10 months ago
- Benchmark and model for step-by-step reasoning in autonomous driving.☆68Updated 10 months ago
- Simulator designed to generate diverse driving scenarios.☆43Updated 11 months ago
- Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"☆124Updated 11 months ago
- ☆41Updated 7 months ago
- ☆53Updated last month
- The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight☆75Updated 2 weeks ago