YanqiDai / MMRole
A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents
☆31Updated this week
Related projects ⓘ
Alternatives and complementary repositories for MMRole
- ☆69Updated 6 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆38Updated 4 months ago
- A Survey on Benchmarks of Multimodal Large Language Models☆64Updated last month
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆56Updated this week
- ☆74Updated 8 months ago
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.☆69Updated 2 months ago
- ☆38Updated 5 months ago
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…☆102Updated last month
- An Easy-to-use Hallucination Detection Framework for LLMs.☆48Updated 7 months ago
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆38Updated last week
- ☆21Updated last month
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents☆46Updated 2 weeks ago
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs☆75Updated last month
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆62Updated 3 weeks ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆87Updated 2 weeks ago
- Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions (NeurIPS 2024)☆144Updated 3 months ago
- Code & Dataset for Paper: "Distill Visual Chart Reasoning Ability from LLMs to MLLMs"☆30Updated 3 weeks ago
- Official repository of MMDU dataset☆75Updated last month
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…☆55Updated last month
- ☆17Updated 4 months ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios☆46Updated 7 months ago
- GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes fr…☆69Updated last week
- ☆32Updated 5 months ago
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria☆55Updated last month
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models☆30Updated this week
- This the implementation of LeCo☆27Updated 4 months ago
- ☆58Updated 9 months ago
- An benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆33Updated last year
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"☆115Updated last week
- [CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback☆235Updated 2 months ago