leofan90 / Awesome-World-Models
A comprehensive list of papers on the definition of World Models and their use in General Video Generation, Embodied AI, and Autonomous Driving, with links to code and related websites.
☆75 · Updated this week
Alternatives and similar repositories for Awesome-World-Models:
Users interested in Awesome-World-Models are comparing it to the repositories listed below.
- [RSS 2024] Learning Manipulation by Predicting Interaction ☆100 · Updated 6 months ago
- Doe-1: Closed-Loop Autonomous Driving with Large World Model ☆79 · Updated last month
- [CoRL 2024] Hint-AD: Holistically Aligned Interpretability for End-to-End Autonomous Driving ☆53 · Updated 3 months ago
- Awesome Papers about World Models in Autonomous Driving ☆75 · Updated 9 months ago
- [NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model ☆61 · Updated 2 months ago
- [NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation ☆95 · Updated 2 months ago
- [ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes ☆111 · Updated last month
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization ☆76 · Updated 2 weeks ago
- A Multi-Modal Large Language Model with Retrieval-augmented In-context Learning capacity designed for generalisable and explainable end-t… ☆83 · Updated 4 months ago
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models) ☆96 · Updated 7 months ago
- Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation ☆86 · Updated last month
- ☆12 · Updated 8 months ago
- ☆48 · Updated last month
- Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives ☆42 · Updated last month
- Repository for the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` ☆83 · Updated 2 months ago
- Simulator-conditioned Driving Scene Generation ☆91 · Updated last week
- List of papers on video-centric robot learning ☆14 · Updated 3 months ago
- Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World ☆124 · Updated 3 months ago
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆82 · Updated 2 months ago
- Code & Data for Grounded 3D-LLM with Referent Tokens ☆98 · Updated last month
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning ☆42 · Updated 3 weeks ago
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆35 · Updated 2 months ago
- Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving ☆78 · Updated last year
- ☆55 · Updated 6 months ago
- GPD-1: Generative Pre-training for Driving ☆68 · Updated 2 months ago
- Official code for the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" ☆66 · Updated last week
- Simulator designed to generate diverse driving scenarios. ☆40 · Updated 9 months ago
- Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595) ☆82 · Updated 2 months ago