hxhcreate / VLSBench

Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety

☆25

Alternatives and similar repositories for VLSBench:

Users that are interested in VLSBench are comparing it to the libraries listed below

git-disl / awesome_LLM-harmful-fine-tuning-papers
A survey on harmful fine-tuning attack for large language model
☆105Updated last week
isXinLiu / MM-SafetyBench
Accepted by ECCV 2024
☆81Updated 2 months ago
AI4Good24 / PsySafe
☆34Updated 2 weeks ago
ydyjya / LLM-IHS-Explanation
☆36Updated 6 months ago
umd-huang-lab / VLM-Poisoning
Code for Neurips 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
☆34Updated 2 months ago
erfanshayegani / Jailbreak-In-Pieces
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…
☆30Updated 6 months ago
UCSC-VLAA / vllm-safety-benchmark
[ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"
☆72Updated last year
ChenWu98 / agent-attack
[Arxiv 2024] Adversarial attacks on multimodal agents
☆44Updated 5 months ago
abc03570128 / Jailbreaking-Attack-against-Multimodal-Large-Language-Model
☆39Updated 4 months ago
SaFoLab-WISC / AdaShield
[ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…
☆48Updated 5 months ago
thu-ml / MMTrustEval
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
☆111Updated last month
OpenSafetyLab / SALAD-BENCH
【ACL 2024】 SALAD benchmark & MD-Judge
☆110Updated 2 weeks ago
yunqing-me / AttackVLM
[NeurIPS-2023] Annual Conference on Neural Information Processing Systems
☆165Updated last year
boyiwei / alignment-attribution-code
Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
☆62Updated 2 months ago
liudaizong / Awesome-LVLM-Attack
😎 up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources.
☆155Updated this week
OPTML-Group / SOUL
Official repo for EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning"
☆14Updated 2 months ago
NY1024 / Foundation-Model-Paper-Notes
☆25Updated 5 months ago
huanranchen / VLMTransfer
A package that achieves 95%+ transfer attack success rate against GPT-4
☆15Updated last month
renqibing / ActorAttack
☆57Updated last month
yuplin2333 / representation-space-jailbreak
Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…
☆14Updated 4 months ago
neelsjain / baseline-defenses
Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models"
☆19Updated last year
Unispac / shallow-vs-deep-alignment
Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep
☆44Updated 5 months ago
thunxxx / MLLM-Jailbreak-evaluation-MMJ-Bench
☆25Updated 2 weeks ago
renqibing / CodeAttack
[ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
☆30Updated last month
inspire-group / RobustRAG
☆10Updated 3 months ago
sail-sg / closer-look-LLM-unlearning
The official code of the paper "A Closer Look at Machine Unlearning for Large Language Models".
☆18Updated last week
chujiezheng / LLM-Safeguard
Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models"
☆78Updated 3 months ago
Improbable-AI / curiosity_redteam
Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizX…
☆66Updated 9 months ago
tmylla / REEF
The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models," aims to protect the IP of open-source…
☆26Updated last month
EddyLuo1232 / JailBreakV_28K
JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and further assess …
☆38Updated 5 months ago