M-E-AGI-Lab / Awesome-World-Models
Official Repo of From Masks to Worlds: A Hitchhiker’s Guide to World Models.
☆34 Updated this week
Alternatives and similar repositories for Awesome-World-Models
Users who are interested in Awesome-World-Models are comparing it to the repositories listed below.
- A list of works on video generation towards world model ☆170 Updated 2 weeks ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆158 Updated 2 weeks ago
- ☆51 Updated 2 months ago
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆43 Updated 5 months ago
- Official Implementation of Paper Transfer between Modalities with MetaQueries ☆257 Updated 2 weeks ago
- [NeurIPS 2025] VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models ☆82 Updated 2 weeks ago
- MetaSpatial leverages reinforcement learning to enhance 3D spatial reasoning in vision-language models (VLMs), enabling more structured, … ☆191 Updated 5 months ago
- [ICML 2025] The code and data of Paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation ☆130 Updated last year
- ☆30 Updated 10 months ago
- ☆94 Updated 3 weeks ago
- ☆149 Updated 9 months ago
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision ☆160 Updated last month
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆91 Updated 7 months ago
- [NeurIPS 2025] WorldMem: Long-term Consistent World Simulation with Memory ☆253 Updated this week
- [NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆121 Updated 2 months ago
- SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning ☆92 Updated 3 months ago
- Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation ☆159 Updated 3 months ago
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆140 Updated last week
- Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning ☆47 Updated 2 weeks ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆81 Updated 8 months ago
- [NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding ☆64 Updated last month
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation ☆46 Updated 2 months ago
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆157 Updated 2 weeks ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆73 Updated 4 months ago
- ☆53 Updated 2 months ago
- Official repository for the UAE paper, unified-GRPO, and unified-Bench ☆142 Updated last month
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆48 Updated 2 months ago
- ☆99 Updated 3 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆159 Updated last month
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning ☆126 Updated 2 months ago