WorldModelBench-Team / WorldModelBench
☆21 Updated 3 months ago
Alternatives and similar repositories for WorldModelBench
Users who are interested in WorldModelBench are comparing it to the repositories listed below
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆76 Updated 3 months ago
- Code for ICML 2025 Paper "Highly Compressed Tokenizer Can Generate Without Training" ☆80 Updated 2 weeks ago
- [CVPR 25] A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning. ☆30 Updated 3 months ago
- Memory Efficient Training Framework for Large Video Generation Model ☆25 Updated last year
- Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces ☆70 Updated 3 weeks ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆36 Updated 4 months ago
- The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation" ☆51 Updated 2 months ago
- Official implementation of the paper "Transfer between Modalities with MetaQueries" ☆65 Updated this week
- Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations ☆48 Updated this week
- Sora Generates Videos with Stunning Geometrical Consistency ☆50 Updated last year
- ☆19 Updated 2 months ago
- Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation ☆111 Updated this week
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆41 Updated 2 months ago
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆114 Updated 2 months ago
- ☆50 Updated 6 months ago
- A list of works on video generation towards world model ☆154 Updated last week
- The official repo for "D-AR: Diffusion via Autoregressive Models" ☆98 Updated last week
- FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling. ☆46 Updated 11 months ago
- [ICLR 2025] Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding ☆39 Updated 2 months ago
- This repository provides the official implementation of VTBench, a benchmark designed to evaluate the performance of visual tokenizers (V… ☆28 Updated 3 weeks ago
- ☆30 Updated 6 months ago
- Code accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment" ☆33 Updated 4 months ago
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆36 Updated last month
- ☆37 Updated 2 weeks ago
- ☆84 Updated last week
- ☆37 Updated last month
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆123 Updated last month
- VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models ☆51 Updated 3 weeks ago
- ☆38 Updated last week
- ☆36 Updated 4 months ago