flageval-baai/FlagEvalMM

flageval-baai / FlagEvalMM

A Flexible Framework for Comprehensive Multimodal Model Evaluation

☆107

Alternatives and similar repositories for FlagEvalMM

Users that are interested in FlagEvalMM are comparing it to the libraries listed below.

MaTengSYSU / HIMRD-jailbreak
Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models"
☆19 · Aug 7, 2025 · Updated 11 months ago
Li-ChangHao / CoNav
☆12 · Jul 16, 2024 · Updated 2 years ago
ZGC-EmbodyAI / TwinBrainVLA
☆29 · May 22, 2026 · Updated last month
MME-Benchmarks / MME-Unify
✨✨ [ICLR 2026] MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
☆42 · Apr 10, 2025 · Updated last year
dongsenzhang / MSB
☆38 · Mar 24, 2026 · Updated 3 months ago
NJUNLP / Hallu-PI
The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed …
☆11 · Sep 27, 2024 · Updated last year
dochouyi / SUCC
☆11 · May 9, 2024 · Updated 2 years ago
Elvin-Yiming-Du / Memory-T1
This repository is used for the time reasoning task in multi-session dialogue systems.
☆16 · Feb 7, 2026 · Updated 5 months ago
Aurora-slz / MM-Verify
☆19 · Oct 28, 2025 · Updated 8 months ago
IVY-LVLM / CODE
Official Implementation of CODE
☆17 · Sep 26, 2024 · Updated last year
jihaonew / MM-Instruct
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment
☆35 · Jul 1, 2024 · Updated 2 years ago
FlagOpen / RoboBrain2.5
RoboBrain 2.5: Advanced version of RoboBrain. Depth in Sight, Time in Mind. 🎉🎉🎉
☆1,115 · Feb 28, 2026 · Updated 4 months ago
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago
JCruan519 / GIST
(ACM MM24) This is the official repository of GIST: Improving Parameter Efficient Fine Tuning via Knowledge Interaction.
☆11 · Jan 28, 2024 · Updated 2 years ago
dw-dengwei / TreeSearchGen
[CVPR 2025🔥] Official codebase for "Global-Local Tree Search in VLMs for 3D Indoor Scene Generation" and our arXiv 2026 extension
☆22 · Jun 5, 2026 · Updated last month
0xWelt / VibeRL
VibeRL is a Reinforcement Learning framework built essentially through vibe coding with Kimi K2.
☆17 · Jul 13, 2026 · Updated last week
ZSHsh98 / EPS-AD
This is the source code for Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score (ICML2023).
☆41 · Oct 15, 2024 · Updated last year
BZX667 / DMPO
☆12 · Jun 19, 2024 · Updated 2 years ago
OpenEnvision / Awesome-Visual-Agent
Awesome Visual Agent
☆19 · Jul 1, 2026 · Updated 2 weeks ago
liuxuannan / Awesome-Multimodal-Jailbreak
A Survey on Jailbreak Attacks and Defenses against Multimodal Generative Models
☆332 · Jan 11, 2026 · Updated 6 months ago
liyucheng09 / llm-compressive
Longitudinal Evaluation of LLMs via Data Compression
☆32 · May 29, 2024 · Updated 2 years ago
TerryPei / CSP
Cross-Self KV Cache Pruning for Efficient Vision-Language Inference
☆10 · Dec 15, 2024 · Updated last year
fangjf1 / OpenSafeMLRM
The first toolkit for MLRM safety evaluation, providing a unified interface for mainstream models, datasets, and jailbreaking methods!
☆15 · Apr 8, 2025 · Updated last year
dunzeng / MORE
Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment
☆16 · Aug 6, 2024 · Updated last year
coolbutuseless / triangular
Decompose complex polygons into sets of triangles
☆10 · Oct 4, 2020 · Updated 5 years ago
Bruce-Lee-LY / cutlass_gemm
Multiple GEMM operators are constructed with cutlass to support LLM inference.
☆20 · Aug 3, 2025 · Updated 11 months ago
ZijunLi7 / ConDo
A public repository for ConDo (AAAI25 accepted)
☆10 · Dec 21, 2024 · Updated last year
Kyyle2114 / Convolutional-Adapter-for-Segment-Anything
CAD - Memory Efficient Convolutional Adapter for Segment Anything
☆12 · Oct 4, 2024 · Updated last year
AI45Lab / VLSBench
[ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety
☆62 · Jul 21, 2025 · Updated last year
lizhaoliu-Lec / ImageAestheticAssessmentPyTorch
Image Aesthetic Assessment in PyTorch with implemented popular datasets and models (possibly providing the pretrained ones).
☆41 · Sep 7, 2022 · Updated 3 years ago
sled-group / moh
[NeurIPS 2024] Official Repository of Multi-Object Hallucination in Vision-Language Models
☆37 · Nov 13, 2024 · Updated last year
liuxuannan / FAK-Owl
[ACM MM 2024] FKA-Owl: Advancing Multimodal Fake News Detection through Knowledge-Augmented LVLMs
☆59 · Aug 8, 2024 · Updated last year
Quhaoh233 / TokenRec
[IEEE TKDE] An LLM-based Recommender System with user & item Tokenizers and a generative retrieval paradigm.
☆31 · Mar 11, 2026 · Updated 4 months ago
xhan77 / veiled-toxicity-detection
Fortifying Toxic Speech Detectors Against Veiled Toxicity
☆11 · Oct 21, 2020 · Updated 5 years ago
swordlidev / Evaluation-Multimodal-LLMs-Survey
A Survey on Benchmarks of Multimodal Large Language Models
☆156 · Jul 13, 2026 · Updated last week
Sherrylife / FedLMT
[ICML2024] "FedLMT: Tackling System Heterogeneity of Federated Learning via Low-Rank Model Training with Theoretical Guarantees" by Jiaha…
☆14 · Sep 22, 2024 · Updated last year
FlagOpen / RoboBrain-X0
☆117 · Oct 27, 2025 · Updated 8 months ago
isle-dev / MetricEval
MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va…
☆12 · Nov 6, 2023 · Updated 2 years ago
chenzen94 / debug-deepspeed-chat
Debug DeepSpeed-Chat step by step in an IDE
☆10 · Apr 17, 2023 · Updated 3 years ago