csebuetnlp / IllusionVQA
This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models"
☆13 Updated 5 months ago
Alternatives and similar repositories for IllusionVQA:
Users who are interested in IllusionVQA are comparing it to the repositories listed below.
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆115 Updated 9 months ago
- [ECCV 2024] Official Release of SILC: Improving vision language pretraining with self-distillation ☆40 Updated 6 months ago
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆69 Updated 3 months ago
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024] ☆20 Updated 7 months ago
- Official implementation of the paper The Hidden Language of Diffusion Models ☆72 Updated last year
- [ECCV 2024 Oral] ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction ☆64 Updated 7 months ago
- ☆35 Updated 6 months ago
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024… ☆34 Updated 4 months ago
- Code for the paper "Data Attribution for Text-to-Image Models by Unlearning Synthesized Images." ☆13 Updated last month
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆44 Updated last year
- Official Repository of Personalized Visual Instruct Tuning ☆28 Updated 3 weeks ago
- Awesome Instruction Editing. Image and Media Editing with Human Instructions. Instruction-Guided Image and Media Editing. ☆31 Updated 3 months ago
- 🦾 EvalGIM (pronounced as "EvalGym") is an evaluation library for generative image models. It enables easy-to-use, reproducible automatic… ☆68 Updated 3 months ago
- This repo contains the code for "MEGA-Bench Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR2025] ☆60 Updated this week
- [CVPR 2024 Highlight] OpenBias: Open-set Bias Detection in Text-to-Image Generative Models ☆23 Updated last month
- ☆37 Updated 8 months ago
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation ☆24 Updated 2 months ago
- GeckoNum Benchmark for T2I Model Eval. ☆11 Updated 3 months ago
- Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral) ☆117 Updated 11 months ago
- ☆21 Updated last year
- ☆13 Updated 6 months ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs. ☆97 Updated last week
- TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering ☆157 Updated 11 months ago
- ☆31 Updated last year
- [CVPR24 Highlights] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning ☆29 Updated 8 months ago
- [ECCV 2024] Official PyTorch implementation of "Getting it Right: Improving Spatial Consistency in Text-to-Image Models" ☆100 Updated 8 months ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding ☆38 Updated last week
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation ☆31 Updated 2 months ago
- [CVPR 2024] Official PyTorch implementation of "ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation" ☆62 Updated 11 months ago
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆62 Updated 2 weeks ago