leofan90 / Awesome-World-Models
A comprehensive list of resources on the definition of World Models and their use in General Video Generation, Embodied AI, and Autonomous Driving, including papers, code, and related websites.
☆143Updated this week
Alternatives and similar repositories for Awesome-World-Models
Users who are interested in Awesome-World-Models are comparing it to the repositories listed below.
- [RSS 2024] Learning Manipulation by Predicting Interaction☆107Updated 9 months ago
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model☆230Updated last month
- [RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions☆351Updated this week
- Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"☆95Updated 3 months ago
- [NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation☆114Updated 6 months ago
- RoboDual: Dual-System for Robotic Manipulation☆80Updated last month
- ☆68Updated 3 weeks ago
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization☆129Updated 2 months ago
- Latest Advances on Vision-Language-Action Models.☆60Updated 3 months ago
- ☆388Updated last year
- [CVPR 2025] The official Implementation of "Universal Actions for Enhanced Embodied Foundation Models"☆172Updated 2 months ago
- ☆178Updated last month
- Awesome Papers about World Models in Autonomous Driving☆80Updated last year
- [ICLR 2025] LAPA: Latent Action Pretraining from Videos☆293Updated 4 months ago
- [CVPR 2025] Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation☆142Updated 3 months ago
- ☆54Updated 3 months ago
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).☆114Updated 11 months ago
- Official code for the CVPR 2025 paper "Navigation World Models".☆185Updated last month
- Online RL with Simple Reward Enables Training VLA Models with Only One Trajectory☆157Updated last week
- Nexus: Decoupled Diffusion Sparks Adaptive Scene Generation☆56Updated last week
- Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"☆257Updated last year
- Official PyTorch Implementation of Unified Video Action Model (RSS 2025)☆203Updated 2 months ago
- Official repo of VLABench, a large scale benchmark designed for fairly evaluating VLA, Embodied Agent, and VLMs.☆227Updated last week
- Single-file implementation to advance vision-language-action (VLA) models with reinforcement learning.☆96Updated 2 weeks ago
- Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning☆231Updated 2 months ago
- [RSS 2024] Code for "Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals" for CALVIN experiments with pre…☆140Updated 7 months ago
- [ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation☆186Updated last month
- [ICML 2024] The official Implementation of "DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning"☆80Updated last week
- The Official Implementation of RoboMatrix☆91Updated 2 weeks ago
- RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints☆48Updated last week