albertwy / GPT-4V-Evaluation
Data for evaluating GPT-4V
☆11Updated 10 months ago
Related projects:
- ☆11Updated 2 months ago
- ☆31Updated 3 months ago
- ☆49Updated last year
- ☆13Updated 10 months ago
- A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models☆31Updated 3 months ago
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆32Updated 10 months ago
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".☆28Updated last year
- An Easy-to-use Hallucination Detection Framework for LLMs.☆48Updated 4 months ago
- ☆22Updated last month
- [ACM MM 2022]: Multi-Modal Experience Inspired AI Creation☆18Updated 3 months ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)☆28Updated 5 months ago
- my commonly-used tools☆46Updated last month
- [ICML'2024] Can AI Assistants Know What They Don't Know?☆62Updated 7 months ago
- ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation☆80Updated 2 months ago
- An automatic MLLM hallucination detection framework☆17Updated 11 months ago
- ☆17Updated 2 months ago
- VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)☆21Updated 2 months ago
- Code and data for "Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue" (ACL 2024)☆20Updated last month
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆21Updated 3 months ago
- ☆73Updated 8 months ago
- [Arxiv] Calibrated Self-Rewarding Vision Language Models☆35Updated 3 months ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆128Updated 4 months ago
- This repository contains code to evaluate various multimodal large language models using different instructions across multiple multimoda…☆24Updated 4 months ago
- ☆20Updated 4 months ago
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆22Updated 2 weeks ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆71Updated 7 months ago
- ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without rely…☆46Updated last year
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆21Updated 2 months ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆40Updated 3 months ago
- Visual and Embodied Concepts evaluation benchmark☆21Updated 11 months ago