[ACL2025 Oral & Award] Evaluate Image/Video Generation like Humans - Fast, Explainable, Flexible
☆122Aug 10, 2025Updated 7 months ago
Alternatives and similar repositories for Evaluation-Agent
Users that are interested in Evaluation-Agent are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [Siggraph Asia 2025] Official code release of our paper "Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy"☆60Sep 26, 2025Updated 5 months ago
- Official Implementation of MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models☆12Nov 1, 2025Updated 4 months ago
- Official Implementation of OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation☆40Jul 5, 2025Updated 8 months ago
- [CVPR2025] Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion☆188Apr 3, 2025Updated 11 months ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning☆143Aug 21, 2025Updated 7 months ago
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.☆86May 4, 2025Updated 10 months ago
- A light-weight and high-efficient training framework for accelerating diffusion tasks.☆52Sep 14, 2024Updated last year
- A list of works on video generation towards world model☆433Updated this week
- This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe…☆128Jan 29, 2026Updated last month
- [ICLR 2025] Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion☆293Nov 4, 2025Updated 4 months ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models …☆87Jan 21, 2026Updated 2 months ago
- [CVPR2024 Highlight] VBench - We Evaluate Video Generation☆1,541Mar 16, 2026Updated last week
- Offical respority for Gait Recogniton with Drones: A benchmark (TMM 2023)☆10Feb 2, 2024Updated 2 years ago
- OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models☆19Feb 20, 2025Updated last year
- MovieAgent: Automated Movie Generation via Multi-Agent CoT Planning☆304Mar 26, 2025Updated 11 months ago
- [CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images☆64Jan 23, 2026Updated 2 months ago
- [ NeurIPS 2024 D&B Track ] Implementation for "FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models"☆73Dec 27, 2024Updated last year
- ☆58Jan 30, 2026Updated last month
- ☆21Jul 22, 2025Updated 8 months ago
- ☆26Jun 22, 2024Updated last year
- Official implement of MIA-DPO☆72Jan 23, 2025Updated last year
- This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language M…☆24Apr 27, 2025Updated 10 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Jun 28, 2024Updated last year
- Chinese-native image generation while compatible with SD eco-system, 1st-gen, AAAI2025☆13Jun 25, 2024Updated last year
- ☆12Jun 20, 2023Updated 2 years ago
- [EMNLP’24 Main] Encoding and Controlling Global Semantics for Long-form Video Question Answering☆18Oct 9, 2024Updated last year
- [SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing☆25Nov 14, 2025Updated 4 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents☆52Feb 27, 2025Updated last year
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆35Aug 9, 2023Updated 2 years ago
- This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https…☆22Jul 5, 2024Updated last year
- Repo for "Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content"☆41Jun 9, 2025Updated 9 months ago
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception☆159Dec 6, 2024Updated last year
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated last year
- ☆13Feb 11, 2021Updated 5 years ago
- Benchmarking and Analyzing Generative Data for Visual Recognition☆26Jul 25, 2023Updated 2 years ago
- Why do deep convolutional networks generalize so poorly to small image transformations?☆11Jun 23, 2019Updated 6 years ago
- ☆25May 28, 2025Updated 9 months ago
- Official codes for NAACL 2025 paper "LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias …☆11Nov 25, 2025Updated 3 months ago
- Code for CineScale, higher-resolution video generation based on Wan☆185Aug 25, 2025Updated 6 months ago