thu-ml/MMTrustEval

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/thu-ml/MMTrustEval)

thu-ml / MMTrustEval

A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)

☆177

Alternatives and similar repositories for MMTrustEval

Users that are interested in MMTrustEval are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Aries-iai / Cross-modal_Patch_Attack
View on GitHub
Unified Adversarial Patch for Cross-modal Attacks in the Physical World (ICCV, 2023)
☆45Dec 15, 2023Updated 2 years ago
thu-ml / STAIR
View on GitHub
Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"
☆89Feb 26, 2025Updated last year
CLIAgroup / ANDA
View on GitHub
[CVPR2024 Highlight] Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning
☆19Jun 14, 2024Updated 2 years ago
thu-ml / Attack-Bard
View on GitHub
☆108Feb 16, 2024Updated 2 years ago
isXinLiu / MM-SafetyBench
View on GitHub
Accepted by ECCV 2024
☆218Oct 15, 2024Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
Aries-iai / TT3D
View on GitHub
The official implementation for "Towards Transferable Targeted 3D Adversarial Attack in the Physical World" (CVPR, 2024))
☆43Aug 6, 2024Updated last year
shighghyujie / infrared_patch_attack
View on GitHub
Physically Adversarial Infrared Patches with Learnable Shapes and Locations
☆42Aug 13, 2023Updated 2 years ago
fangjf1 / OpenSafeMLRM
View on GitHub
The first toolkit for MLRM safety evaluation, providing unified interface for mainstream models, datasets, and jailbreaking methods!
☆15Apr 8, 2025Updated last year
thunxxx / MLLM-Jailbreak-evaluation-MMJ-Bench
View on GitHub
☆81Mar 30, 2025Updated last year
xirui-li / MOSSBench
View on GitHub
An implementation for MLLM oversensitivity evaluation
☆18Nov 16, 2024Updated last year
isXinLiu / Awesome-MLLM-Safety
View on GitHub
Accepted by IJCAI-24 Survey Track
☆233Aug 25, 2024Updated last year
abc03570128 / Jailbreaking-Attack-against-Multimodal-Large-Language-Model
View on GitHub
☆63Aug 11, 2024Updated last year
zycheiheihei / Transferable-Visual-Prompting
View on GitHub
[CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…
☆45Dec 20, 2024Updated last year
yuki-younai / Jailbreak-R1
View on GitHub
offical implementation of Jailbreak-R1
☆15Jul 16, 2025Updated last year
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
yibo-miao / T2VSafetyBench
View on GitHub
☆28Nov 4, 2024Updated last year
RUCAIBox / HADES
View on GitHub
[ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆39Oct 23, 2024Updated last year
CryptoAILab / FigStep
View on GitHub
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆211Jun 26, 2025Updated last year
UCSB-AI / MSSBench
View on GitHub
[ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"
☆36Jun 23, 2025Updated last year
Unispac / Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
View on GitHub
Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models
☆281May 13, 2024Updated 2 years ago
erfanshayegani / Jailbreak-In-Pieces
View on GitHub
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…
☆93Jun 6, 2024Updated 2 years ago
lorraine021 / Awesome-MLLM-Hallucination
View on GitHub
A comprehensive collection of resources focused on addressing and understanding hallucination phenomena in MLLMs.
☆35May 7, 2024Updated 2 years ago
gyhdog99 / ECSO
View on GitHub
ECSO (Make MLLM safe without neither training nor any external models!) (https://arxiv.org/abs/2403.09572)
☆37Nov 2, 2024Updated last year
Zoky-2020 / Security_and_Privacy_in_AIGC
View on GitHub
A list of research towards security&privacy in AI-Generated Content
☆17Jan 10, 2025Updated last year
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
AI45Lab / MLLMGuard
View on GitHub
☆46Jun 19, 2025Updated last year
NY1024 / BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt
View on GitHub
☆61Jun 5, 2024Updated 2 years ago
HashmatShadab / HSAT
View on GitHub
[MICCAI 2025] Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology
☆12Jun 17, 2025Updated last year
Alibaba-AAIG / Oyster
View on GitHub
The Oyster series is a set of safety models developed in-house by Alibaba-AAIG, devoted to building a responsible AI ecosystem. | Oyster …
☆62Apr 29, 2026Updated 2 months ago
persistz / attack-as-defense
View on GitHub
Code for ISSTA'21 paper 'Attack as Defense: Characterizing Adversarial Examples using Robustness'.
☆11Sep 4, 2021Updated 4 years ago
roywang021 / UMK
View on GitHub
Code for ACM MM2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models
☆34Dec 30, 2024Updated last year
AoiDragon / HADES
View on GitHub
[ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆41Oct 17, 2024Updated last year
FenHua / transfer_adv
View on GitHub
transfer attack; adversarial examples; black-box attack; unrestricted Adversarial Attacks on ImageNet; CVPR2021 天池黑盒竞赛
☆24Oct 24, 2021Updated 4 years ago
liudaizong / Awesome-LVLM-Attack
View on GitHub
😎 up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources.
☆567Jul 15, 2026Updated last week
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
huanranchen / AdversarialAttacks
View on GitHub
☆83Jul 23, 2024Updated 2 years ago
Heathcliff-saku / ViewFool_
View on GitHub
This repository contains the ViewFool and ImageNet-V proposed by the paper “ViewFool: Evaluating the Robustness of Visual Recognition to …
☆33Dec 18, 2023Updated 2 years ago
P2333 / Rectified-Rejection
View on GitHub
Coupling rejection strategy against adversarial attacks (CVPR 2022)
☆29Mar 2, 2022Updated 4 years ago
ShawnXYang / C-GSP
View on GitHub
☆16Jul 25, 2022Updated 4 years ago
FreedomIntelligence / TRIM
View on GitHub
We introduce new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆22Jan 11, 2026Updated 6 months ago
ShawnXYang / Face-Robustness-Benchmark
View on GitHub
An adversarial robustness evaluation library on face recognition.
☆114Jul 6, 2023Updated 3 years ago
aiPenguin / StopReasoning
View on GitHub
☆15Oct 6, 2024Updated last year