AI45Lab/VLSBench

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/AI45Lab/VLSBench)

AI45Lab / VLSBench

[ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety

☆62

Alternatives and similar repositories for VLSBench

Users that are interested in VLSBench are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

AI45Lab / DEAN
View on GitHub
☆11Oct 25, 2024Updated last year
DripNowhy / ETA
View on GitHub
[ICLR 2025] PyTorch Implementation of "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time"
☆34Jul 20, 2025Updated last year
EchoseChen / SPA-VL-RLHF
View on GitHub
The reinforcement learning codes for dataset SPA-VL
☆48Jun 24, 2024Updated 2 years ago
AI45Lab / DeepSafe
View on GitHub
All-in-One Safety Evaluation Framwork
☆51Updated this week
OpenSafetyLab / SALAD-BENCH
View on GitHub
【ACL 2024】 SALAD benchmark & MD-Judge
☆176Mar 8, 2025Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
CryptoAILab / FigStep
View on GitHub
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆212Jun 26, 2025Updated last year
AI45Lab / MLLMGuard
View on GitHub
☆46Jun 19, 2025Updated last year
AI45Lab / ActorAttack
View on GitHub
☆135Jun 29, 2026Updated 3 weeks ago
ys-zong / VLGuard
View on GitHub
[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.
☆90Jan 19, 2025Updated last year
AI45Lab / REEF
View on GitHub
The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models," aims to protect the IP of open-source…
☆79Jan 16, 2025Updated last year
AI45Lab / CodeAttack
View on GitHub
[ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
☆61Oct 1, 2025Updated 9 months ago
Dtc7w3PQ / Visco-Attack
View on GitHub
Official implementation of Visco-Attack (EMNLP 2025 Main). An open-source one-click reproduction script is also provided.
☆30Apr 11, 2026Updated 3 months ago
UCSB-AI / MSSBench
View on GitHub
[ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"
☆36Jun 23, 2025Updated last year
yjyddq / RiOSWorld
View on GitHub
[NeurIPS 2025] Official repository of RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents
☆123Dec 2, 2025Updated 7 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
erfanshayegani / Jailbreak-In-Pieces
View on GitHub
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…
☆93Jun 6, 2024Updated 2 years ago
AI4Good24 / PsySafe
View on GitHub
☆53Feb 8, 2025Updated last year
X-PLUG / ToolCUA
View on GitHub
ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
☆58May 13, 2026Updated 2 months ago
AI45Lab / X-Boundary
View on GitHub
[EMNLP 2025] The code repo of paper "X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Com…
☆41Nov 24, 2025Updated 7 months ago
Linn3a / siren
View on GitHub
Official implementation of Selective Entropy Regularization (SIREN), proposed by paper 'Rethinking Entropy Regularization in Large Reason…
☆32Dec 10, 2025Updated 7 months ago
isXinLiu / MM-SafetyBench
View on GitHub
Accepted by ECCV 2024
☆218Oct 15, 2024Updated last year
Dtc7w3PQ / Response-Attack
View on GitHub
Official implementation of “Response Attack: Exploiting Contextual Priming to Jailbreak Large Language Models” (AAAI 2026).
☆37Mar 22, 2026Updated 3 months ago
ChnQ / TracingLLM
View on GitHub
☆30May 22, 2024Updated 2 years ago
pipilurj / MLLM-protector
View on GitHub
The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance"
☆46Apr 21, 2024Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
itsvaibhav01 / Immune
View on GitHub
[CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
☆28Jun 11, 2025Updated last year
ZichenWen1 / DIJA
View on GitHub
(ICLR 2026 🔥) Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs"
☆79Feb 9, 2026Updated 5 months ago
eth-sri / privacy-inference-multimodal
View on GitHub
☆21Feb 3, 2025Updated last year
ShaoShuai0605 / Misevolution
View on GitHub
Official Repo of Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents
☆90Jun 2, 2026Updated last month
Nebularaid2000 / rethink_sft_generalization
View on GitHub
Repo for paper "Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability"
☆108Apr 23, 2026Updated 2 months ago
AI45Lab / IS-Bench
View on GitHub
[AAAI 2026] Data and Code for Paper IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks
☆47Nov 24, 2025Updated 7 months ago
MaTengSYSU / HIMRD-jailbreak
View on GitHub
Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models"
☆19Aug 7, 2025Updated 11 months ago
jamessealesmith / ConStruct-VL
View on GitHub
PyTorch code for the CVPR'23 paper: "ConStruct-VL: Data-Free Continual Structured VL Concepts Learning"
☆13Feb 5, 2024Updated 2 years ago
ml-research / LlavaGuard
View on GitHub
☆70Sep 30, 2025Updated 9 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
zhangzaibin / AD-H
View on GitHub
☆15May 21, 2026Updated last month
SaFo-Lab / JailBreakV_28K
View on GitHub
[COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…
☆96May 9, 2025Updated last year
yjyddq / EOSER-ASS-RL
View on GitHub
Official Repository of "Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Ste…
☆28Mar 9, 2026Updated 4 months ago
ChnQ / MI-Peaks
View on GitHub
☆68Jul 14, 2025Updated last year
adversarial-for-goodness / Co-Attack
View on GitHub
official PyTorch implement of Towards Adversarial Attack on Vision-Language Pre-training Models
☆69Mar 20, 2023Updated 3 years ago
YitingQu / unsafe-diffusion
View on GitHub
☆50Jul 14, 2024Updated 2 years ago
git-disl / awesome_LLM-harmful-fine-tuning-papers
View on GitHub
A survey on harmful fine-tuning attack for large language model (ACM CSUR)
☆247Jun 22, 2026Updated 3 weeks ago