TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
☆181Apr 29, 2024Updated last year
Alternatives and similar repositories for tifa
Users that are interested in tifa are comparing it to the libraries listed below
Sorting:
- Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)☆104Dec 9, 2024Updated last year
- Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection"☆27Jul 10, 2023Updated 2 years ago
- Evaluating text-to-image/video/3D models with VQAScore☆377Sep 22, 2025Updated 5 months ago
- Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis☆647May 24, 2024Updated last year
- Repo for our NeurIPS 2023 paper on: Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Fee…☆27Nov 11, 2023Updated 2 years ago
- ☆580Dec 21, 2024Updated last year
- T2VScore: Towards A Better Metric for Text-to-Video Generation☆81Apr 10, 2024Updated last year
- [Neurips 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image Generation Evaluation☆333Dec 24, 2025Updated 2 months ago
- [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation☆1,641Oct 29, 2025Updated 4 months ago
- RichHF-18K dataset contains rich human feedback labels we collected for our CVPR'24 paper: https://arxiv.org/pdf/2312.10240, along with t…☆153Jun 25, 2024Updated last year
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆45Nov 29, 2023Updated 2 years ago
- Training code for CLIP-FlanT5☆30Jul 29, 2024Updated last year
- AlignProp uses direct reward backpropogation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more samp…☆314Nov 1, 2024Updated last year
- Better Aligning Text-to-Image Models with Human Preference. ICCV 2023☆294Jul 14, 2023Updated 2 years ago
- [NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L…☆57Mar 4, 2024Updated 2 years ago
- [NeurIPS 2024] EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.☆52Oct 14, 2024Updated last year
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope…☆311Mar 12, 2025Updated 11 months ago
- ☆37Oct 7, 2023Updated 2 years ago
- ☆14Jul 5, 2024Updated last year
- Implementation of MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path☆67Jun 23, 2023Updated 2 years ago
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation☆134Oct 25, 2023Updated 2 years ago
- [CVPR2024 Highlight] VBench - We Evaluate Video Generation☆1,496Feb 23, 2026Updated last week
- Code for "DreamEdit: Subject-driven Image Editing" (TMLR2023)☆109Jan 23, 2024Updated 2 years ago
- [NeurIPS 2023] A faithful benchmark for vision-language compositionality☆89Feb 13, 2024Updated 2 years ago
- [CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models☆191Oct 3, 2024Updated last year
- VisualGPTScore for visio-linguistic reasoning☆27Oct 7, 2023Updated 2 years ago
- [NeurIPS 2024] 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching☆168Nov 18, 2024Updated last year
- DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data (NeurIPS 2023 Spotlight) / / / / When Does Perceptual A…☆585Nov 24, 2025Updated 3 months ago
- Subject-Diffusion:Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning☆316Jul 11, 2024Updated last year
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022☆31May 29, 2023Updated 2 years ago
- An innovative method designed to augment the capabilities of existing video diffusion models☆22May 10, 2024Updated last year
- Official GitHub repository for the Text-Guided Video Editing (TGVE) competition of LOVEU Workshop @ CVPR'23.☆78Oct 25, 2023Updated 2 years ago
- [NeurIPS 2023] Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator☆98Mar 18, 2024Updated last year
- Official implementation of SEED-LLaMA (ICLR 2024).☆642Sep 21, 2024Updated last year
- Code for "Diffusion Model Alignment Using Direct Preference Optimization"☆667Nov 10, 2025Updated 3 months ago
- [ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data☆14Sep 30, 2023Updated 2 years ago
- DreamDance: Personalized Text-to-video Generation by Combining Text-to-Image Synthesis and Motion Transfer☆14Dec 16, 2022Updated 3 years ago
- A instruction data generation system for multimodal language models.☆35Jan 31, 2025Updated last year
- Official Implementation for "A Neural Space-Time Representation for Text-to-Image Personalization" (SIGGRAPH Asia 2023)☆181Sep 19, 2023Updated 2 years ago