allenai / molmoact
Official Repository for MolmoAct
☆224Updated last week
Alternatives and similar repositories for molmoact
Users who are interested in molmoact are comparing it to the libraries listed below.
- Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets☆134Updated 2 weeks ago
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025)☆278Updated 3 months ago
- A Vision-Language Model for Spatial Affordance Prediction in Robotics☆195Updated 3 months ago
- Official Repository for SAM2Act☆208Updated 2 months ago
- [ICML 2025] OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction☆106Updated 6 months ago
- A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks☆149Updated last month
- Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos☆168Updated last month
- ☆134Updated 11 months ago
- ☆226Updated last year
- Code for subgoal synthesis via image editing☆143Updated 2 years ago
- NVIDIA GEAR Lab's initiative to solve the robotics data problem using world models☆340Updated this week
- Official implementation of "Data Scaling Laws in Imitation Learning for Robotic Manipulation"☆192Updated 11 months ago
- [ICRA 2025] In-Context Imitation Learning via Next-Token Prediction☆96Updated 7 months ago
- AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World | CoRL 2025☆81Updated 4 months ago
- ☆57Updated 9 months ago
- Official implementation of "OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning"☆191Updated 4 months ago
- VLAC: A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning☆192Updated 3 weeks ago
- Official Repository of "RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation"☆126Updated 4 months ago
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization☆143Updated 6 months ago
- Official implementation of "Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy."☆110Updated last month
- ☆78Updated last month
- Official codebase for "Any-point Trajectory Modeling for Policy Learning"☆256Updated 4 months ago
- Distributed Robot Interaction Dataset.☆259Updated last month
- ☆93Updated last week
- A unified architecture for multimodal multi-task robotic policy learning.☆167Updated last year
- Interactive Post-Training for Vision-Language-Action Models☆134Updated 4 months ago
- ☆266Updated last year
- Code for the paper "Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Manipulation"☆94Updated last year
- [CoRL 2024] Im2Flow2Act: Flow as the Cross-domain Manipulation Interface☆140Updated last year
- dataloading is my passion☆65Updated last year