RoyZry98 / MoLe-VLA-Pytorch
[Arxiv 2025: MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation]
☆29 Updated last month
Alternatives and similar repositories for MoLe-VLA-Pytorch
Users that are interested in MoLe-VLA-Pytorch are comparing it to the libraries listed below
- ☆83 Updated last week
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model ☆203 Updated 2 weeks ago
- OpenHelix: An Open-source Dual-System VLA Model for Robotic Manipulation ☆126 Updated this week
- Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" ☆91 Updated 3 months ago
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization ☆119 Updated last month
- ☆75 Updated 3 weeks ago
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation ☆139 Updated 2 months ago
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025) ☆189 Updated last month
- [ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction ☆76 Updated last month
- SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation ☆149 Updated last week
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre… ☆135 Updated 6 months ago
- Latent Motion Token as the Bridging Language for Robot Manipulation ☆85 Updated this week
- The repo of paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` ☆119 Updated 4 months ago
- ☆46 Updated 5 months ago
- Official implementation of GR-MG ☆79 Updated 4 months ago
- [ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation ☆173 Updated 3 weeks ago
- [RSS 2024] Learning Manipulation by Predicting Interaction ☆107 Updated 8 months ago
- A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and A… ☆110 Updated last week
- AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation ☆71 Updated last month
- ☆62 Updated 2 months ago
- [RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions ☆71 Updated this week
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning ☆64 Updated last week
- Latest Advances on Vision-Language-Action Models. ☆46 Updated 2 months ago
- ☆52 Updated 2 months ago
- RoboDual: Dual-System for Robotic Manipulation ☆72 Updated 2 weeks ago
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning ☆32 Updated last month
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos ☆248 Updated 3 months ago
- ☆116 Updated 2 months ago
- ManiBox: Enhancing Spatial Grasping Generalization via Scalable Simulation Data Generation ☆43 Updated last month
- ☆74 Updated 2 weeks ago