GuochenZhou / World-Model

A paper list of world models

☆29

Alternatives and similar repositories for World-Model

Users who are interested in World-Model are comparing it to the repositories listed below.

PatrickHua / Awesome-World-Models
This repository is a collection of research papers on World Models.
☆41 Updated 2 years ago
OpenGVLab / EmbodiedGPT
☆33 Updated 2 years ago
thuml / ContextWM
Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://ar…
☆67 Updated last year
haosulab / cvpr-tutorial-2022
☆44 Updated 3 years ago
ykarmesh / stable-control-representations
Code for Stable Control Representations
☆26 Updated 6 months ago
OpenGVLab / VeBrain
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
☆84 Updated 4 months ago
InternRobotics / InternVLA-A1
InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation
☆51 Updated last month
video-language-planning / vlp_code
☆77 Updated 5 months ago
xvjiarui / IMProv
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
☆57 Updated last year
declare-lab / Emma-X
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning
☆74 Updated 5 months ago
rainbow979 / robodreamer
☆82 Updated last year
GR1-Manipulation / GR-1
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"
☆44 Updated last year
Jaraxxus-Me / LogiCity
LogiCity@NeurIPS'24, D&B track. A multi-agent inductive learning environment for "abstractions".
☆26 Updated 4 months ago
Biscue5 / EgoScaler
[CVPR 2025 highlight] Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision
☆30 Updated this week
joyhsu0504 / LEFT
☆46 Updated last year
yanx27 / CLEVR3D
CLEVR3D Dataset: Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation
☆19 Updated last year
kyegomez / awesome-robotic-foundation-models
A vast array of Multi-Modal Embodied Robotic Foundation Models!
☆26 Updated last year
Boyiliee / ITP-BobaRobot
Code for "Interactive Task Planning with Language Models"
☆32 Updated 6 months ago
UMass-Embodied-AGI / MultiPLY
Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
☆133 Updated last year
metadriverse / ACO
[ECCV 2022] Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining
☆85 Updated 2 years ago
mihirp1998 / Slot-TTA
Slot-TTA shows that test-time adaptation using slot-centric models can improve image segmentation on out-of-distribution examples.
☆26 Updated 2 years ago
anuragajay / hip
Codebase for HiP
☆90 Updated last year
EmbodiedGPT / EgoCOT_Dataset
☆54 Updated last year
microsoft / smart
Codebase for ICLR 2023 paper, "SMART: Self-supervised Multi-task pretrAining with contRol Transformers"
☆54 Updated last year
USC-GVL / PhysBench
[ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models …
☆73 Updated 5 months ago
raunaqbhirangi / hiss
Hierarchical State Space Models
☆47 Updated last year
OpenDriveLab / MPI
[RSS 2024] Learning Manipulation by Predicting Interaction
☆115 Updated 4 months ago
liufanfanlff / RoboUniview
☆60 Updated 8 months ago
OpenHelix-Team / VLA-RFT
VLA-RFT: Vision-Language-Action Models with Reinforcement Fine-Tuning
☆66 Updated 3 weeks ago
allenai / interactron
A Model for Embodied Adaptive Object Detection
☆46 Updated 3 years ago