apple/ml-mia-bench

This repo contains code and data for the ICLR 2025 paper "MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs".

☆38

Alternatives and similar repositories for ml-mia-bench

Users that are interested in ml-mia-bench are comparing it to the libraries listed below.

RainBowLuoCS / MMEvol
(ACL 2025) 🔥🔥🔥 Code for "Empowering Multimodal Large Language Models with Evol-Instruct"
☆22 · May 15, 2025 · Updated last year
MozerWang / promISe
[COLING 2024 (Oral)] PromISe: Releasing the Capabilities of LLMs with Prompt Introspective Search
☆23 · Aug 26, 2024 · Updated last year
RainBowLuoCS / DEEM
(ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.
☆51 · Jul 1, 2025 · Updated last year
IPBench / IPBench
[ACL 2026] Repository of IPBench
☆23 · Apr 6, 2026 · Updated 3 months ago
snap-research / VIMI
☆13 · Jul 10, 2024 · Updated 2 years ago
II-Bench / II-Bench
☆28 · Oct 28, 2024 · Updated last year
October2001 / ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
☆61 · Jul 23, 2024 · Updated 2 years ago
MozerWang / AMPO
[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents
☆51 · Feb 2, 2026 · Updated 5 months ago
SYuan03 / MM-IFEngine
[ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following
☆126 · Feb 13, 2026 · Updated 5 months ago
TerminologyHub / termhub-in-5-minutes
Developer project for getting basic API integrations working in under 5 minutes
☆11 · May 22, 2026 · Updated 2 months ago
BatsResearch / fudd
Follow-Up Differential Descriptions: Language Models Resolve Ambiguities for Image Classification
☆11 · Nov 15, 2023 · Updated 2 years ago
hpcaitech / GPT-Demo
GPT Demo with hybrid distributed training
☆10 · Dec 1, 2022 · Updated 3 years ago
Lillianwei-h / MMIE
[ICLR'25 Oral] MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
☆35 · Nov 3, 2024 · Updated last year
ydc123 / MMP-Attack
Official repository for "On the Multi-modal Vulnerability of Diffusion Models"
☆17 · Jul 15, 2024 · Updated 2 years ago
RAIVNLab / neural-priming
Code repository for the paper "Neural Priming for Sample-Efficient Adaptation"
☆14 · Nov 13, 2023 · Updated 2 years ago
core-mm / core-mm
☆17 · Feb 22, 2024 · Updated 2 years ago
si0wang / ViCrit
☆24 · Jun 18, 2025 · Updated last year
OoDBag / VisTA
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection
☆27 · May 31, 2025 · Updated last year
Leezekun / MMSci
MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension
☆51 · Dec 3, 2024 · Updated last year
apple / ml-epicache
☆30 · Oct 2, 2025 · Updated 9 months ago
zchoi / SPT
[TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning".
☆10 · Aug 14, 2024 · Updated last year
Lyun0912-wu / LongAttn
LongAttn: Selecting Long-context Training Data via Token-level Attention
☆15 · Jul 16, 2025 · Updated last year
Aurora-slz / MM-Verify
☆19 · Oct 28, 2025 · Updated 8 months ago
holarissun / RewardModelingBeyondBradleyTerry
Official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…
☆73 · Apr 2, 2025 · Updated last year
salomonhotegni / MDMTN
[IJCNN 2024] Multi-Objective Optimization for Sparse Deep Multi-Task Learning
☆16 · May 22, 2025 · Updated last year
mdabbah / COOD_benchmarking
☆16 · May 27, 2024 · Updated 2 years ago
gl-ybnbxb / BoNBoN
☆19 · Jun 3, 2024 · Updated 2 years ago
RaphaelOlivier / whisper_attack
☆23 · Apr 3, 2025 · Updated last year
SalesforceAIResearch / indict_code_gen
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
☆15 · Jun 2, 2026 · Updated last month
xiaoboxia / RTM_LNL
Regularly Truncated M-estimators for Learning with Noisy Labels
☆11 · Apr 24, 2024 · Updated 2 years ago
prometheus-eval / prometheus-vision
[ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific…
☆86 · Sep 13, 2024 · Updated last year
RenlyH / CodeV
[CVPR 2026 Oral] Code with Image
☆31 · Dec 5, 2025 · Updated 7 months ago
wqshmzh / CANet-CZSL
Official PyTorch implementation of CVPR 2023 paper "Learning Conditional Attributes for Compositional Zero-Shot Learning"
☆18 · Oct 19, 2025 · Updated 9 months ago
jimmyli08 / DARNet-CD
The source code of DARNet with PyTorch implementation; remote sensing change detection
☆16 · Jun 19, 2022 · Updated 4 years ago
CyanScholar / CloudNativeSim
A toolkit for modeling and simulation of cloud-native applications.
☆16 · Aug 4, 2025 · Updated 11 months ago
MozerWang / Loong
[EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
☆155 · Dec 22, 2025 · Updated 7 months ago
zhenyuhe00 / BiPE
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024
☆24 · Jun 26, 2024 · Updated 2 years ago
oss-roettger / T5-Textual-Inversion
Textual Inversion for DeepFloyd IF
☆61 · Sep 19, 2023 · Updated 2 years ago
maitrix-org / dynamic-alignment-optimization
[EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-…
☆24 · Nov 17, 2024 · Updated last year