This repo contains code and data for the ICLR 2025 paper MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
☆38Mar 9, 2025Updated last year
Alternatives and similar repositories for ml-mia-bench
Users who are interested in ml-mia-bench are comparing it to the libraries listed below.
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆20May 15, 2025Updated 10 months ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.☆10May 16, 2024Updated last year
- [ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents☆49Feb 2, 2026Updated last month
- ☆31Sep 12, 2025Updated 6 months ago
- ☆28Oct 28, 2024Updated last year
- Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA.☆27Aug 2, 2024Updated last year
- Repository of IPBench☆20Jan 4, 2026Updated 2 months ago
- [ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following☆119Feb 13, 2026Updated last month
- Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation☆27Dec 12, 2025Updated 3 months ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 8 months ago
- ☆17Oct 24, 2020Updated 5 years ago
- Clinical NLP concept extraction of ADEs in the 2018 n2c2 Adverse Drug Events and Medication Extraction (Track 2). Includes data preproce…☆16Nov 21, 2020Updated 5 years ago
- A simple template for theoretical computer science assignments☆11Sep 6, 2023Updated 2 years ago
- SimKO: Simple Pass@K Policy Optimization☆28Oct 24, 2025Updated 5 months ago
- GPT Demo with hybrid distributed training☆10Dec 1, 2022Updated 3 years ago
- A dataset of scientific vector graphics in TikZ for training generative models.☆25Feb 4, 2026Updated last month
- This repo explains how QR codes work, including QR detection and decoding.☆68Nov 22, 2022Updated 3 years ago
- ☆19Nov 28, 2020Updated 5 years ago
- Learning Safety Constraints for Large Language Models (ICML2025)☆33Aug 4, 2025Updated 7 months ago
- Source code for the ACL 2018 paper: "A Walk-based model on Entity Graphs for Relation Extraction"☆13Jul 25, 2024Updated last year
- [ICLR'25 Oral] MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models☆36Nov 3, 2024Updated last year
- ☆17Feb 22, 2024Updated 2 years ago
- Code repository for the paper - "Neural Priming for Sample-Efficient Adaptation"☆14Nov 13, 2023Updated 2 years ago
- ☆31Feb 26, 2026Updated last month
- Confidence Regulation Neurons in Language Models (NeurIPS 2024)☆15Feb 1, 2025Updated last year
- Multimodal RewardBench☆64Feb 21, 2025Updated last year
- [NeurIPS 2025] L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models☆24Oct 29, 2025Updated 4 months ago
- [TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning".☆10Aug 14, 2024Updated last year
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…☆64May 15, 2025Updated 10 months ago
- Evaluate gpt-4o on CLIcK (Korean NLP Dataset)☆20May 18, 2024Updated last year
- An up-to-date list of works on Multi-domain Multi-task learning☆18Oct 20, 2022Updated 3 years ago
- ☆19Oct 28, 2025Updated 4 months ago
- INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness☆14Nov 10, 2025Updated 4 months ago
- ☆16May 27, 2024Updated last year
- A project for tri-modal LLM benchmarking and instruction tuning.☆56Mar 27, 2025Updated last year
- ☆18Jun 3, 2024Updated last year
- Reinforcement learning course, mainly on how to solve problems with reinforcement learning☆15Dec 10, 2024Updated last year
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specific…☆81Sep 13, 2024Updated last year
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift☆38Jan 25, 2024Updated 2 years ago