pliang279 / HEMM
Holistic evaluation of multimodal foundation models
☆42 Updated 5 months ago
Alternatives and similar repositories for HEMM:
Users who are interested in HEMM are comparing it to the repositories listed below
- Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning". ☆38 Updated 10 months ago
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆78 Updated 8 months ago
- Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆51 Updated 3 weeks ago
- Preference Learning for LLaVA ☆29 Updated 2 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆70 Updated 5 months ago
- ☆31 Updated 11 months ago
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024) ☆27 Updated 3 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆97 Updated 2 weeks ago
- ☆31 Updated 2 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆72 Updated 4 months ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆112 Updated 6 months ago
- ☆24 Updated 5 months ago
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific… ☆62 Updated 4 months ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆47 Updated last month
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆29 Updated last year
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆62 Updated 7 months ago
- Language Quantized AutoEncoders ☆95 Updated last year
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters". ☆43 Updated 2 weeks ago
- [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc… ☆12 Updated 4 months ago
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging" ☆37 Updated last year
- https://arxiv.org/abs/2209.15162 ☆48 Updated last year
- ☆39 Updated 5 months ago
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023) ☆32 Updated last year
- Official PyTorch Implementation for Task Vectors are Cross-Modal ☆21 Updated last month
- Implementation of Bitune: Bidirectional Instruction-Tuning ☆16 Updated 7 months ago
- ☆37 Updated 2 months ago
- ☆43 Updated 5 months ago
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity" ☆19 Updated 3 months ago
- ☆47 Updated last year
- visual question answering prompting recipes for large vision-language models ☆23 Updated 4 months ago