leofan90 / Awesome-World-Models
A comprehensive collection of papers on the definition of World Models and their use in General Video Generation, Embodied AI, and Autonomous Driving, including papers, code, and related websites.
☆84 · Updated 3 weeks ago
Alternatives and similar repositories for Awesome-World-Models:
Users interested in Awesome-World-Models are comparing it to the libraries listed below.
- [RSS 2024] Learning Manipulation by Predicting Interaction ☆101 · Updated 7 months ago
- [NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model ☆65 · Updated 3 months ago
- HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model ☆130 · Updated last week
- [CoRL 2024] Hint-AD: Holistically Aligned Interpretability for End-to-End Autonomous Driving ☆55 · Updated 5 months ago
- Awesome Papers about World Models in Autonomous Driving ☆79 · Updated 11 months ago
- Doe-1: Closed-Loop Autonomous Driving with Large World Model ☆87 · Updated 2 months ago
- [NeurIPS 2024] CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation ☆107 · Updated 3 months ago
- Benchmark and model for step-by-step reasoning in autonomous driving. ☆38 · Updated 2 weeks ago
- Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning ☆175 · Updated last week
- [ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes ☆113 · Updated last month
- GRAPE: Guided-Reinforced Vision-Language-Action Preference Optimization ☆101 · Updated last week
- A Multi-Modal Large Language Model with Retrieval-augmented In-context Learning capacity designed for generalisable and explainable end-t… ☆88 · Updated 5 months ago
- ☆12 · Updated 9 months ago
- ☆51 · Updated last month
- Official code of the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" ☆85 · Updated last month
- ☆17 · Updated 2 weeks ago
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models) ☆107 · Updated 8 months ago
- ☆58 · Updated 7 months ago
- Simulator designed to generate diverse driving scenarios. ☆40 · Updated last month
- Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving ☆79 · Updated last year
- ☆42 · Updated last week
- GPD-1: Generative Pre-training for Driving ☆71 · Updated 3 months ago
- [ECCV 2024] The official code for "Dolphins: Multimodal Language Model for Driving" ☆67 · Updated last month
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆94 · Updated 4 months ago
- The repo of the paper `RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation` ☆103 · Updated 3 months ago
- ☆36 · Updated last month
- Simulator-conditioned Driving Scene Generation ☆105 · Updated last month
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks ☆19 · Updated this week
- ☆22 · Updated 2 months ago
- Latest Advances on Vision-Language-Action Models. ☆30 · Updated 3 weeks ago