GigaAI-research / General-World-Models-Survey

☆464

Alternatives and similar repositories for General-World-Models-Survey

Users interested in General-World-Models-Survey are comparing it to the repositories listed below.

leofan90 / Awesome-World-Models
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and A…
☆784 Updated this week
alibaba-damo-academy / RynnVLA-002
WorldVLA: Towards Autoregressive Action World Model
☆560 Updated this week
vision-x-nyu / thinking-in-space
Official repo and evaluation implementation of VSI-Bench
☆638 Updated 3 months ago
facebookresearch / nwm
Official code for the CVPR 2025 paper "Navigation World Models".
☆448 Updated this week
OpenDriveLab / Vista
[NeurIPS 2024] A Generalizable World Model for Autonomous Driving
☆821 Updated 4 months ago
chaytonmin / Awesome-Papers-World-Models-Autonomous-Driving
Awesome Papers about World Models in Autonomous Driving
☆86 Updated last year
tulerfeng / Awesome-Embodied-Multimodal-LLMs
Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).
☆121 Updated last year
InternRobotics / EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
☆640 Updated 5 months ago
HaoranZhuExplorer / World-Models-Autonomous-Driving-Latest-Survey
A curated list of world models for autonomous driving. Keep updated.
☆423 Updated last week
embodied-generalist / embodied-generalist
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
☆466 Updated 7 months ago
SenseTime-FVG / OpenDWM
An open source code repository of driving world models, with training, inferencing, evaluation tools, and pretrained checkpoints.
☆323 Updated 5 months ago
tsinghua-fib-lab / World-Model
[ACM CSUR 2025] Understanding World or Predicting Future? A Comprehensive Survey of World Models
☆207 Updated last week
YvanYin / DrivingWorld
Code for "DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT"
☆222 Updated 10 months ago
AnjieCheng / SpatialRGPT
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
☆284 Updated 11 months ago
LatentActionPretraining / LAPA
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
☆407 Updated 10 months ago
InternRobotics / InternVLA-M1
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
☆286 Updated 2 weeks ago
hustvl / AlphaDrive
Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning
☆298 Updated 8 months ago
baaivision / UniVLA
Unified Vision-Language-Action Model
☆233 Updated last month
haoranD / Awesome-Embodied-AI
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
☆485 Updated 5 months ago
liruiw / HPT
Heterogeneous Pre-trained Transformer (HPT) as Scalable Policy Learner.
☆520 Updated 11 months ago
Zhoues / RoboRefer
[NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"
☆202 Updated last month
diankun-wu / Spatial-MLLM
Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
☆387 Updated 5 months ago
UMass-Embodied-AGI / 3D-VLA
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
☆595 Updated last year
starVLA / starVLA
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
☆492 Updated last week
f1yfisher / DriveDreamer2
[AAAI 2025] DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation
☆210 Updated 8 months ago
BAAI-DCAI / SpatialBot
The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models".
☆317 Updated 2 months ago
metadriverse / metaurban
[ICLR 2025 Spotlight] MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility
☆211 Updated last month
PzySeere / MetaSpatial
MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, …
☆193 Updated 6 months ago
USC-GVL / Agent-Driver
A Language Agent for Autonomous Driving
☆287 Updated last year
ZCMax / LLaVA-3D
[ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
☆352 Updated last month