THUDM / VisionReward

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

☆278

Alternatives and similar repositories for VisionReward

Users who are interested in VisionReward are comparing it to the libraries listed below.

NJU-PCALab / OpenVid-1M
[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
☆313 Updated last month
djghosh13 / geneval
GenEval: An object-focused framework for evaluating text-to-image alignment
☆323 Updated 4 months ago
TIGER-AI-Lab / VideoScore
Official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP 2024]
☆92 Updated 5 months ago
VARGPT-family / VARGPT-v1.1
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
☆258 Updated 2 months ago
XueZeyue / DanceGRPO
☆433 Updated this week
illume-unified-mllm / ILLUME_plus
☆111 Updated 3 weeks ago
KwaiVGI / VideoAlign
Improving Video Generation with Human Feedback
☆220 Updated 3 months ago
wdrink / SimpleAR
PyTorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
☆383 Updated 3 weeks ago
Karine-Huang / T2I-CompBench
[NeurIPS 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image Generation Evaluation
☆271 Updated 3 months ago
DCDmllm / AnyEdit
【CVPR 2025 Oral】Official Repo for Paper "AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea"
☆163 Updated 3 months ago
AILab-CVC / CV-VAE
[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
☆276 Updated 7 months ago
krennic999 / STAR
STAR: Scale-wise Text-to-image generation via Auto-Regressive representations
☆144 Updated 4 months ago
WangWenhao0716 / VidProM
[NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
☆156 Updated 9 months ago
YuqingWang1029 / PAR
[CVPR 2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project
☆166 Updated 3 months ago
KaiyueSun98 / T2V-CompBench
[CVPR 2025] T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation
☆87 Updated last month
mihirp1998 / VADER
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope…
☆290 Updated 4 months ago
RockeyCoss / SPO
[CVPR 2025] Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization
☆234 Updated 3 months ago
rongyaofang / GoT
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆263 Updated 2 months ago
evalcrafter / EvalCrafter
[CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
☆170 Updated 9 months ago
End2End-Diffusion / REPA-E
[ICCV 2025] Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
☆301 Updated 2 months ago
PKU-YuanGroup / Open-Sora-Dataset
☆105 Updated last year
Kwai-Kolors / MPS
☆176 Updated last year
mihirp1998 / AlignProp
AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more samp…
☆291 Updated 8 months ago
hutaiHang / ToMe
[NeurIPS 2024] Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis
☆73 Updated 5 months ago
baaivision / NOVA
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
☆545 Updated last month
yk7333 / d3po
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
☆233 Updated last year
PKU-YuanGroup / ImgEdit
ImgEdit: A Unified Image Editing Dataset and Benchmark
☆138 Updated last week
CodeGoat24 / LiFT
Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.
☆79 Updated 2 months ago
mlpc-ucsd / TokenCompose
(CVPR 2024) 🧩 TokenCompose: Text-to-Image Diffusion with Token-level Supervision
☆126 Updated 6 months ago
KwaiVGI / Koala-36M
Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Vi…
☆186 Updated 3 months ago