allenai / aokvqa
Official repository for the A-OKVQA dataset
☆109 · May 8, 2024 · Updated last year
Alternatives and similar repositories for aokvqa
Users interested in aokvqa are comparing it to the repositories listed below.
- Code for the paper "Point and Ask: Incorporating Pointing into Visual Question Answering" ☆19 · Oct 4, 2022 · Updated 3 years ago
- PyTorch DataLoader for many VQA datasets ☆14 · Jan 10, 2023 · Updated 3 years ago
- An automatic MLLM hallucination detection framework ☆19 · Sep 26, 2023 · Updated 2 years ago
- Evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆159 · Sep 27, 2025 · Updated 4 months ago
- ☆43 · Aug 15, 2023 · Updated 2 years ago
- ☆68 · Oct 27, 2023 · Updated 2 years ago
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities ☆43 · Jun 7, 2025 · Updated 8 months ago
- (CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions. ☆360 · Jan 14, 2025 · Updated last year
- [EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆105 · Aug 21, 2025 · Updated 5 months ago
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their… ☆20 · Jan 11, 2026 · Updated last month
- [CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering ☆20 · Sep 21, 2024 · Updated last year
- [CVPR 2025] Code release of "F-LMM: Grounding Frozen Large Multimodal Models" ☆108 · May 29, 2025 · Updated 8 months ago
- [ACL 2025 Findings] Official PyTorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vis… ☆24 · Jul 21, 2024 · Updated last year
- ☆63 · Jan 3, 2025 · Updated last year
- MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024) ☆321 · Jan 20, 2025 · Updated last year
- Code for "Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model" ☆13 · Feb 15, 2024 · Updated last year
- ☆360 · Jan 27, 2024 · Updated 2 years ago
- ☆390 · Mar 11, 2021 · Updated 4 years ago
- ☆93 · Mar 29, 2019 · Updated 6 years ago
- Dataset and starter code for the visual entailment dataset ☆118 · Apr 21, 2022 · Updated 3 years ago
- [ICML 2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models" ☆22 · Jan 1, 2025 · Updated last year
- [NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought … ☆424 · Dec 22, 2024 · Updated last year
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation ☆97 · Jan 29, 2024 · Updated 2 years ago
- Data and code for the NeurIPS 2022 paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering" ☆725 · Sep 19, 2024 · Updated last year
- Harnessing 1.4M GPT4V-synthesized Data for a Lite Vision-Language Model ☆281 · Jun 25, 2024 · Updated last year
- ☆24 · Oct 9, 2023 · Updated 2 years ago
- Long Context Transfer from Language to Vision ☆398 · Mar 18, 2025 · Updated 10 months ago
- Official repo of "CIBench: Evaluation of LLMs as Code Interpreter" ☆14 · Jul 19, 2024 · Updated last year
- Code for the paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning" ☆46 · Sep 8, 2023 · Updated 2 years ago
- Implementation of "Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering" ☆10 · Mar 17, 2022 · Updated 3 years ago
- [ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning ☆70 · Sep 20, 2025 · Updated 4 months ago
- [EMNLP 2023] InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions ☆25 · May 30, 2024 · Updated last year
- MLLM-DataEngine: An Iterative Refinement Approach for MLLM ☆48 · May 24, 2024 · Updated last year
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning ☆296 · Mar 13, 2024 · Updated last year
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆245 · Aug 21, 2025 · Updated 5 months ago
- [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(… ☆325 · Oct 14, 2025 · Updated 3 months ago
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning" ☆11 · Jan 10, 2025 · Updated last year
- ☆32 · Jan 25, 2026 · Updated 2 weeks ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆155 · Apr 30, 2024 · Updated last year