LiAutoAD / LightVLA
LightVLA
☆56Updated last week
Alternatives and similar repositories for LightVLA
Users who are interested in LightVLA are comparing it to the repositories listed below.
- [arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps☆68Updated last week
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).☆121Updated last year
- [NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"☆185Updated last week
- Nav-R1: Reasoning and Navigation in Embodied Scenes☆60Updated last month
- LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained🔥]☆163Updated last month
- Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.☆40Updated last month
- Latest Advances on Vision-Language-Action Models.☆116Updated 7 months ago
- ☆41Updated 4 months ago
- [TMLR'25] AutoTrust, a groundbreaking benchmark designed to assess the trustworthiness of DriveVLMs. This work aims to enhance public saf…☆51Updated 10 months ago
- ☆84Updated 5 months ago
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model☆304Updated last month
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO☆74Updated 5 months ago
- Official code for "Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation"☆89Updated 2 months ago
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934☆130Updated this week
- WorldVLA: Towards Autoregressive Action World Model☆472Updated 3 weeks ago
- Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces☆84Updated 4 months ago
- 🦾 A Dual-System VLA with System2 Thinking☆114Updated 2 months ago
- [NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge☆203Updated last month
- ☆54Updated 7 months ago
- [NeurIPS 2025]⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.☆229Updated 3 weeks ago
- ☆76Updated last week
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy☆219Updated last week
- [CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan…☆61Updated 7 months ago
- A Multi-Modal Large Language Model with Retrieval-augmented In-context Learning capacity designed for generalisable and explainable end-t…☆111Updated last year
- NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks☆180Updated 3 months ago
- This repository is the official implementation of our paper (From reactive to cognitive: brain-inspired spatial intelligence for embodied…☆62Updated last month
- ☆31Updated 2 weeks ago
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks☆176Updated last month
- ☆19Updated last year
- [NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning☆25Updated 2 weeks ago