OpenGVLab / PhyGenBench

[ICML 2025] Code and data for the paper "Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation"

☆116

Alternatives and similar repositories for PhyGenBench

Users who are interested in PhyGenBench compare it to the repositories listed below.

Hritikbansal / videophy
Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics
☆129 Updated 2 months ago
wusize / Harmon
[ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
☆145 Updated 2 months ago
PKU-YuanGroup / WISE
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
☆136 Updated last month
CodeGoat24 / LiFT
Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.
☆79 Updated 3 months ago
vision-x-nyu / pisa-experiments
Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025)
☆39 Updated 2 months ago
TIGER-AI-Lab / VideoScore
Official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024]
☆94 Updated 5 months ago
ziqipang / RandAR
[CVPR 2025 (Oral)] Open implementation of "RandAR"
☆182 Updated 3 weeks ago
rongyaofang / PUMA
Empowering Unified MLLM with Multi-granular Visual Generation
☆127 Updated 6 months ago
showlab / FAR
Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"
☆234 Updated 3 months ago
gogoduan / GoT-R1
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning
☆94 Updated 2 months ago
KaiyueSun98 / T2V-CompBench
[CVPR 2025] T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation
☆90 Updated 2 months ago
facebookresearch / metaquery
Official Implementation of Paper Transfer between Modalities with MetaQueries
☆186 Updated 2 weeks ago
KwaiVGI / DiffMoE
PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT
☆121 Updated 3 months ago
pittisl / PhyT2V
Official code repo of CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation
☆40 Updated this week
xizaoqu / MOFT
[NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller
☆45 Updated 3 months ago
selftok-team / SelftokTokenizer
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
☆198 Updated 2 months ago
Franklin-Zhang0 / ReasonGen-R1
Official repository for ReasonGen-R1
☆56 Updated last month
wdrink / SimpleAR
PyTorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
☆390 Updated last month
rongyaofang / GoT
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆272 Updated 3 months ago
CIntellifusion / VideoDPO
Official Implementation of VideoDPO
☆130 Updated 2 months ago
DuNGEOnmassster / VideoGen-of-Thought
Official Implementation of VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention
☆39 Updated 3 months ago
facebookresearch / metamorph
Code for MetaMorph: Multimodal Understanding and Generation via Instruction Tuning
☆199 Updated 3 months ago
wusize / OpenUni
☆144 Updated last month
aHapBean / VideoREPA
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models
☆54 Updated last month
csuhan / Tar
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
☆129 Updated 3 weeks ago
PhoenixZ810 / RISEBench
Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
☆79 Updated 2 weeks ago
SilentView / LVD-2M
[NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions"
☆65 Updated 9 months ago
lxa9867 / ControlVAR
This is the official implementation for ControlVAR.
☆117 Updated 7 months ago
NJU-PCALab / OpenVid-1M
[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
☆325 Updated 2 months ago
yuhuUSTC / FAR
Frequency Autoregressive Image Generation with Continuous Tokens
☆81 Updated last month