This repo contains code and data for the ICLR 2025 paper "MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs".
☆38 · Mar 9, 2025 · Updated last year
Alternatives and similar repositories for ml-mia-bench
Users that are interested in ml-mia-bench are comparing it to the libraries listed below.
- (ACL 2025) 🔥🔥🔥 Code for "Empowering Multimodal Large Language Models with Evol-Instruct" ☆22 · May 15, 2025 · Updated last year
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models. ☆10 · May 16, 2024 · Updated 2 years ago
- [ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling ☆22 · Dec 16, 2024 · Updated last year
- [ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents ☆50 · Feb 2, 2026 · Updated 3 months ago
- ☆13 · Jul 10, 2024 · Updated last year
- ☆28 · Oct 28, 2024 · Updated last year
- [ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models ☆60 · Jul 23, 2024 · Updated last year
- Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA. ☆27 · Aug 2, 2024 · Updated last year
- [ACL 2026] Repository of IPBench ☆22 · Apr 6, 2026 · Updated last month
- [ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following ☆122 · Feb 13, 2026 · Updated 3 months ago
- ☆46 · Dec 16, 2025 · Updated 5 months ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention ☆15 · Jul 16, 2025 · Updated 10 months ago
- A simple template for theoretical computer science assignments ☆12 · Sep 6, 2023 · Updated 2 years ago
- Developer project for getting basic API integrations working in under 5 minutes ☆11 · Updated this week
- Paper list of compositional zero-shot learning ☆11 · Jul 5, 2022 · Updated 3 years ago
- GPT Demo with hybrid distributed training ☆10 · Dec 1, 2022 · Updated 3 years ago
- SimKO: Simple Pass@K Policy Optimization ☆30 · Oct 24, 2025 · Updated 7 months ago
- Follow-Up Differential Descriptions: Language Models Resolve Ambiguities for Image Classification ☆11 · Nov 15, 2023 · Updated 2 years ago
- Official repository for "On the Multi-modal Vulnerability of Diffusion Models" ☆16 · Jul 15, 2024 · Updated last year
- A dataset of scientific vector graphics in TikZ for training generative models. ☆27 · Feb 4, 2026 · Updated 3 months ago
- Learning Safety Constraints for Large Language Models (ICML2025) ☆34 · Aug 4, 2025 · Updated 9 months ago
- [ICLR'25 Oral] MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models ☆36 · Nov 3, 2024 · Updated last year
- This repo explain how qr codes works, qr detection and decoding. ☆69 · Nov 22, 2022 · Updated 3 years ago
- ☆17 · Feb 22, 2024 · Updated 2 years ago
- IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse ☆101 · Mar 14, 2026 · Updated 2 months ago
- Confidence Regulation Neurons in Language Models (NeurIPS 2024) ☆15 · Feb 1, 2025 · Updated last year
- Chinese-Handwriting-Tool ☆13 · Nov 11, 2023 · Updated 2 years ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆72 · Apr 2, 2025 · Updated last year
- [NeurIPS'25] ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and R… ☆38 · Sep 27, 2025 · Updated 8 months ago
- [NeurIPS 2025] L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models ☆28 · May 8, 2026 · Updated 2 weeks ago
- Multimodal RewardBench ☆68 · Feb 21, 2025 · Updated last year
- [TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning". ☆10 · Aug 14, 2024 · Updated last year
- Evaluate gpt-4o on CLIcK (Korean NLP Dataset) ☆20 · May 18, 2024 · Updated 2 years ago
- An up-to-date list of works on Multi-domain Multi-task learning ☆18 · Oct 20, 2022 · Updated 3 years ago
- Source code for the IJCNN2024 paper titled "Multi-Objective Optimization for Sparse Deep Multi-Task Learning" ☆16 · May 22, 2025 · Updated last year
- ☆16 · May 27, 2024 · Updated 2 years ago
- ☆25 · May 28, 2025 · Updated 11 months ago
- A project for tri-modal LLM benchmarking and instruction tuning. ☆61 · Mar 27, 2025 · Updated last year
- ☆18 · Jun 3, 2024 · Updated last year