leofan90 / Awesome-World-Models
A comprehensive collection of papers on the definition of World Models and their use for General Video Generation, Embodied AI, and Autonomous Driving, including papers, code, and related websites.
☆84 · Updated 3 weeks ago
Alternatives and similar repositories for Awesome-World-Models:
Users interested in Awesome-World-Models are comparing it to the repositories listed below.
- [NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model ☆65 · Updated 3 months ago
- Doe-1: Closed-Loop Autonomous Driving with Large World Model ☆88 · Updated 2 months ago
- [CoRL 2024] Hint-AD: Holistically Aligned Interpretability for End-to-End Autonomous Driving ☆55 · Updated 5 months ago
- Awesome Papers about World Models in Autonomous Driving ☆80 · Updated 11 months ago
- [RSS 2024] Learning Manipulation by Predicting Interaction ☆101 · Updated 7 months ago
- Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning ☆175 · Updated this week
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model ☆130 · Updated this week
- [ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes ☆113 · Updated last month
- ☆57 · Updated 7 months ago
- Benchmark and model for step-by-step reasoning in autonomous driving. ☆38 · Updated 2 weeks ago
- A Multi-Modal Large Language Model with Retrieval-augmented In-context Learning capacity designed for generalisable and explainable end-t… ☆88 · Updated 5 months ago
- Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives ☆59 · Updated last month
- Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving ☆79 · Updated last year
- ☆36 · Updated 3 weeks ago
- [NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation ☆106 · Updated 3 months ago
- Simulator designed to generate diverse driving scenarios. ☆40 · Updated last month
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models). ☆107 · Updated 8 months ago
- ☆51 · Updated last month
- [ECCV 2024] Asynchronous Large Language Model Enhanced Planner for Autonomous Driving ☆72 · Updated last month
- Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595) ☆86 · Updated 3 months ago
- ☆12 · Updated 9 months ago
- Official code of the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" ☆85 · Updated last month
- GPD-1: Generative Pre-training for Driving ☆71 · Updated 3 months ago
- [ECCV 2024] The official code for "Dolphins: Multimodal Language Model for Driving" ☆67 · Updated last month
- [ECCV 2024] Embodied Understanding of Driving Scenarios ☆184 · Updated 2 months ago
- ☆42 · Updated last week
- Simulator-conditioned Driving Scene Generation ☆105 · Updated last month
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization ☆98 · Updated last week
- ☆22 · Updated 2 months ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … ☆81 · Updated this week