[ACL2025 Oral & Award] Evaluate Image/Video Generation like Humans - Fast, Explainable, Flexible
☆121Aug 10, 2025Updated 6 months ago
Alternatives and similar repositories for Evaluation-Agent
Users that are interested in Evaluation-Agent are comparing it to the libraries listed below
Sorting:
- [Siggraph Asia 2025] Official code release of our paper "Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy"☆58Sep 26, 2025Updated 5 months ago
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.☆85May 4, 2025Updated 9 months ago
- Official Implementation of OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation☆38Jul 5, 2025Updated 7 months ago
- This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Gener…☆17Jul 1, 2024Updated last year
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning☆141Aug 21, 2025Updated 6 months ago
- A light-weight and high-efficient training framework for accelerating diffusion tasks.☆51Sep 14, 2024Updated last year
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Jun 28, 2024Updated last year
- Repo for "Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content"☆40Jun 9, 2025Updated 8 months ago
- Official implement of MIA-DPO☆70Jan 23, 2025Updated last year
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated 11 months ago
- ☆16Jul 23, 2024Updated last year
- A list of works on video generation towards world model☆355Feb 11, 2026Updated 2 weeks ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models …☆85Jan 21, 2026Updated last month
- Chinese-native image generation while compatible with SD eco-system, 1st-gen, AAAI2025☆13Jun 25, 2024Updated last year
- Repository for initial POC NLP based SQL adapter using LLM.☆10May 6, 2025Updated 9 months ago
- MovieAgent: Automated Movie Generation via Multi-Agent CoT Planning☆294Mar 26, 2025Updated 11 months ago
- Unifying Specialized Visual Encoders for Video Language Models☆25Nov 22, 2025Updated 3 months ago
- Official Implementation for "Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing"☆55Sep 12, 2024Updated last year
- [CVPR 2026] Thinking with Programming Vision: Towards a Unified View for Thinking with Images☆56Jan 23, 2026Updated last month
- ☆26Jun 22, 2024Updated last year
- ☆56Jan 30, 2026Updated last month
- [CVPRW 2025] UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inpu…☆105Apr 23, 2025Updated 10 months ago
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception☆159Dec 6, 2024Updated last year
- This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe…☆121Jan 29, 2026Updated last month
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence.☆28Dec 30, 2025Updated 2 months ago
- [ACM MM 2025] LMM4Edit: Benchmarking and Evaluating Multimodal Image Editing with LMMs☆15Feb 10, 2026Updated 3 weeks ago
- The code implementation of Symbolic-MoE☆46Sep 2, 2025Updated 6 months ago
- [ NeurIPS 2024 D&B Track ] Implementation for "FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models"☆73Dec 27, 2024Updated last year
- E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models☆39Jan 5, 2026Updated last month
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning☆46Jul 17, 2025Updated 7 months ago
- Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)☆104Dec 9, 2024Updated last year
- [ICLR 2025] Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion☆291Nov 4, 2025Updated 3 months ago
- [SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing☆26Nov 14, 2025Updated 3 months ago
- Offical respority for Gait Recogniton with Drones: A benchmark (TMM 2023)☆10Feb 2, 2024Updated 2 years ago
- A UI designer for constructing AI applications with OpenSearch☆16Updated this week
- Agent-OM: Leveraging LLM Agents for Ontology Matching☆18Jan 24, 2026Updated last month
- Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions☆21Feb 11, 2026Updated 2 weeks ago
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement"☆52Dec 5, 2024Updated last year
- [CVPR2024 Highlight] VBench - We Evaluate Video Generation☆1,496Feb 23, 2026Updated last week