IranQin / Awesome_World_Model_Papers
[World-Model-Survey-2024] Paper list and projects for world models
☆11 · Updated 7 months ago
Alternatives and similar repositories for Awesome_World_Model_Papers
Users interested in Awesome_World_Model_Papers are comparing it to the repositories listed below.
- [ICML 2025] Code and data for the paper "Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation" ☆107 · Updated 7 months ago
- Latent Motion Token as the Bridging Language for Robot Manipulation ☆89 · Updated 3 weeks ago
- RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints ☆48 · Updated last week
- Single-file implementation to advance vision-language-action (VLA) models with reinforcement learning ☆96 · Updated 2 weeks ago
- [arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence ☆26 · Updated this week
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆106 · Updated last month
- Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223 ☆133 · Updated 2 weeks ago
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆34 · Updated 3 weeks ago
- ☆99 · Updated 2 weeks ago
- List of papers on video-centric robot learning ☆20 · Updated 6 months ago
- ☆89 · Updated 3 weeks ago
- ☆30 · Updated 6 months ago
- Code for "FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks" ☆66 · Updated 5 months ago
- ☆72 · Updated 9 months ago
- AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation ☆75 · Updated 2 months ago
- ☆46 · Updated 5 months ago
- A comprehensive list of papers investigating physical cognition in video generation, including papers, code, and related websites ☆110 · Updated this week
- ☆129 · Updated 5 months ago
- Official repo of VLABench, a large-scale benchmark designed for fair evaluation of VLAs, embodied agents, and VLMs ☆227 · Updated last week
- Official implementation of "Self-Improving Video Generation" ☆66 · Updated last month
- [ICLR 2025 Spotlight] Grounding Video Models to Actions through Goal Conditioned Exploration ☆48 · Updated last month
- ☆39 · Updated this week
- [CVPR 2024] Official implementation of MP5 ☆102 · Updated 11 months ago
- Official code repo of the CVPR 2025 paper "PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation" ☆31 · Updated 2 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆105 · Updated this week
- [ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models ☆33 · Updated last year
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆104 · Updated 6 months ago
- Official repository of "Learning to Act from Actionless Videos through Dense Correspondences" ☆216 · Updated last year
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? ☆49 · Updated this week
- [ICML 2024] A Touch, Vision, and Language Dataset for Multimodal Alignment ☆78 · Updated this week