Jiaaqiliu / Awesome-VLA-Robotics
A comprehensive list of excellent research papers, models, datasets, and other resources on Vision-Language-Action (VLA) models in robotics.
☆443Updated 2 weeks ago
Alternatives and similar repositories for Awesome-VLA-Robotics
Users who are interested in Awesome-VLA-Robotics are comparing it to the repositories listed below
- Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method (CVPR-25)☆186Updated 2 months ago
- BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence☆234Updated 5 months ago
- Official implementation of OpenTrack.☆700Updated last month
- Official implementation of OpenWBT.☆777Updated 3 months ago
- R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization☆432Updated 3 weeks ago
- Codebase for the ICCV 2025 paper "One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory"☆121Updated 3 months ago
- Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos☆296Updated last month
- ☆97Updated last month
- CausalVLR: A Toolbox and Benchmark for Vision-Language Causal Reasoning (an open-source framework for multimodal causal reasoning)☆1,145Updated last month
- This is the official repository for C3-OWD: A Curriculum Cross-modal Contrastive Learning Framework for Open-World Detection☆122Updated last month
- OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models☆139Updated 6 months ago
- SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning☆960Updated last month
- This repository summarizes recent advances in the VLA + RL paradigm and provides a taxonomic classification of relevant works.☆324Updated last month
- TaskExp is a generic multi-task pre-training algorithm to enhance the generalization of learning-based multi-robot exploration policies.☆30Updated 2 months ago
- The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model"☆198Updated this week
- Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression☆91Updated 10 months ago
- ☆344Updated 2 weeks ago
- A curated list of large VLM-based VLA models for robotic manipulation.☆238Updated last week
- [ICLR 2025] CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs☆128Updated 5 months ago
- Papers list of empathy in LMs: theory, modeling, systems, emotion, evaluation.☆74Updated 3 months ago
- This is the official implementation of the paper "ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy".☆287Updated this week
- Single-file implementation to advance vision-language-action (VLA) models with reinforcement learning.☆333Updated last week
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model☆315Updated last month
- [RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions☆827Updated last week
- [Actively Maintained🔥] A list of Embodied AI papers accepted by top conferences (ICLR, NeurIPS, ICML, RSS, CoRL, ICRA, IROS, CVPR, ICCV,…☆405Updated 2 weeks ago
- [IEEE TASE] The Official Implementation for ''Boosting Global-Local Feature Matching via Anomaly Synthesis for Multi-Class Point Cloud An…☆27Updated 5 months ago
- 🔥 SpatialVLA: a spatial-enhanced vision-language-action model that is trained on 1.1 Million real robot episodes. Accepted at RSS 2025.☆565Updated 4 months ago
- [ECAI 2024] MoSt-DSA: Modeling Motion and Structural Interactions for Direct Multi-Frame Interpolation in DSA Images☆12Updated 11 months ago
- Paper list in the survey: A Survey on Vision-Language-Action Models: An Action Tokenization Perspective☆320Updated 4 months ago
- [CVPR 2025] The official implementation of "Universal Actions for Enhanced Embodied Foundation Models"☆211Updated last week