Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024)
☆68Nov 19, 2024Updated last year
Alternatives and similar repositories for VIEScore
Users that are interested in VIEScore are comparing it to the libraries listed below
Sorting:
- [ICLR 2025] HQ-Edit: A High-Quality and High-Coverage Dataset for General Image Editing☆113Apr 18, 2024Updated last year
- [ACM Multimedia 2025 Datasets Track] EditWorld: Simulating World Dynamics for Instruction-Following Image Editing☆139Aug 2, 2025Updated 6 months ago
- Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-L…☆37Apr 13, 2022Updated 3 years ago
- Code and Data for "GenAI Arena: An Open Evaluation Platform for Generative Models" [NeurIPS 2024]☆34Sep 8, 2024Updated last year
- Repo for our work "Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence"☆19Jun 2, 2025Updated 8 months ago
- (ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆48Jul 1, 2025Updated 8 months ago
- Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)☆104Dec 9, 2024Updated last year
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis"☆26Nov 21, 2025Updated 3 months ago
- ☆11Jul 26, 2024Updated last year
- ☆11Jun 21, 2025Updated 8 months ago
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing☆32Dec 9, 2025Updated 2 months ago
- FreeCond: A Free Lunch for Input Conditions in Text-Guided Inpainting. FreeCond introduces a more generalized form💪 of the original inpa…☆15May 22, 2025Updated 9 months ago
- Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks [ICLR 2026]☆26Feb 22, 2026Updated last week
- [CVPR 2024] KEPP: Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos☆12Sep 24, 2024Updated last year
- This repository contains the code for CVPRW 2024 paper: Generating Material-Aware 3D Models from Sparse Views☆13Jun 11, 2024Updated last year
- Training code for CLIP-FlanT5☆30Jul 29, 2024Updated last year
- Repo for "Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content"☆40Jun 9, 2025Updated 8 months ago
- Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing"☆53Jul 5, 2025Updated 7 months ago
- The official repository of paper named 'A Refer-and-Ground Multimodal Large Language Model for Biomedicine'☆34Nov 5, 2024Updated last year
- ☆13Feb 26, 2025Updated last year
- [CVPR 2025] AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM☆18Aug 6, 2025Updated 6 months ago
- ☆22Nov 25, 2025Updated 3 months ago
- [NeurIPS 2025 Spotlight] VisualQuality-R1 is the first open-sourced NR-IQA model can accurately describe and rate the image quality.☆157Oct 15, 2025Updated 4 months ago
- ☆41Sep 9, 2025Updated 5 months ago
- ☆32Oct 18, 2024Updated last year
- official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024]☆113Dec 4, 2025Updated 2 months ago
- ☆64Jul 10, 2025Updated 7 months ago
- ☆35Nov 5, 2024Updated last year
- ☆19Sep 19, 2024Updated last year
- A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.☆19Feb 15, 2024Updated 2 years ago
- Official Code for "Intelligent Painter: Picture Composition With Resampling Diffusion Model" (ICIP 2023)☆17Jun 23, 2023Updated 2 years ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆23Jul 30, 2024Updated last year
- Evaluating text-to-image/video/3D models with VQAScore☆377Sep 22, 2025Updated 5 months ago
- TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering☆181Apr 29, 2024Updated last year
- ☆578Dec 21, 2024Updated last year
- [CVPR`2024, Oral] Attention Calibration for Disentangled Text-to-Image Personalization☆109Apr 10, 2024Updated last year
- Official repository for the paper "MICo-150K: A Comprehensive Dataset for Multi-Image Composition".☆54Feb 21, 2026Updated last week
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation☆39Jul 22, 2025Updated 7 months ago
- Geometry-aware Novel View Synthesis with Pre-trained 2D Prior☆39Jun 3, 2023Updated 2 years ago