mbzuai-oryx / ALM-Bench
[CVPR 2025 🔥] ALM-Bench is a multilingual, multimodal, culturally diverse benchmark spanning 100 languages across 19 categories. It assesses the next generation of LMMs on cultural inclusivity.
☆45 · Updated 5 months ago
Alternatives and similar repositories for ALM-Bench
Users interested in ALM-Bench are comparing it to the libraries listed below.
- (WACV 2025 - Oral) Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, H… ☆83 · Updated 3 months ago
- Holistic evaluation of multimodal foundation models ☆47 · Updated last year
- [ACL 2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models ☆78 · Updated 5 months ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆163 · Updated last month
- [ICLR 2025] Video Action Differencing ☆48 · Updated 4 months ago
- [ECCV 2024] Official release of SILC: Improving Vision-Language Pretraining with Self-Distillation ☆47 · Updated last year
- ☆53 · Updated 10 months ago
- Matryoshka Multimodal Models ☆115 · Updated 9 months ago
- [ICLR '25] Official PyTorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆92 · Updated 5 months ago
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation) ☆16 · Updated last year
- Python library to evaluate VLMs' robustness across diverse benchmarks ☆219 · Updated 3 weeks ago
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-Tuning ☆88 · Updated last year
- ☆41 · Updated last year
- Evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆149 · Updated last month
- Repository for the paper "TiC-CLIP: Continual Training of CLIP Models" (ICLR 2024) ☆108 · Updated last year
- [CVPR 2024] KEPP: Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos ☆12 · Updated last year
- ☆35 · Updated last year
- Code for "Enhancing In-Context Learning via Linear Probe Calibration" ☆36 · Updated last year
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation" ☆38 · Updated 6 months ago
- Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral) ☆127 · Updated 2 weeks ago
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity" ☆33 · Updated last year
- ☆80 · Updated last year
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202… ☆39 · Updated 5 months ago
- Code for the paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models" ☆48 · Updated last year
- Official implementation of MAIA, a Multimodal Automated Interpretability Agent ☆94 · Updated 3 weeks ago
- ☆69 · Updated last year
- [Fully open] [Encoder-free MLLM] Vision as LoRA ☆346 · Updated 5 months ago
- [ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning ☆155 · Updated 3 months ago
- Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models" ☆27 · Updated last year
- [CVPR 2024] Official implementation of the paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding" ☆49 · Updated 5 months ago