ChengshuaiZhao0 / The-Wolf-Within
⭐10 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for The-Wolf-Within
- ⭐24 · Updated 3 months ago
- [ICLR 2024 Spotlight 🔥] - [Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal… ⭐24 · Updated 5 months ago
- [ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking… ⭐21 · Updated 3 weeks ago
- ⭐17 · Updated last week
- AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models ⭐43 · Updated 7 months ago
- Official code for "TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization", CVPR 2023 ⭐13 · Updated last year
- ⭐30 · Updated 4 months ago
- [ICLR 2023] Official repository of the paper "Rethinking the Effect of Data Augmentation in Adversarial Contrastive Learning" ⭐17 · Updated last year
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs" ⭐67 · Updated 11 months ago
- Code for NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" ⭐28 · Updated last month
- The official implementation of ECCV'24 paper "To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Uns… ⭐57 · Updated last week
- [ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models. ⭐45 · Updated 2 months ago
- [ICLR 2024 Oral] Less is More: Fewer Interpretable Region via Submodular Subset Selection ⭐72 · Updated last month
- Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794… ⭐11 · Updated 3 months ago
- Official PyTorch implementation of "CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning" @ ICCV 2023 ⭐28 · Updated 10 months ago
- ⭐56 · Updated last month
- Official implementation of NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Model… ⭐25 · Updated last week
- ⭐13 · Updated 4 months ago
- [ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detect and mitigate object hallucinatio… ⭐16 · Updated 4 months ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance" ⭐31 · Updated 6 months ago
- ⭐15 · Updated this week
- One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models ⭐37 · Updated 5 months ago
- ⭐38 · Updated last year
- [arXiv 2024] Adversarial attacks on multimodal agents ⭐38 · Updated 4 months ago
- [NeurIPS-2023] Annual Conference on Neural Information Processing Systems ⭐161 · Updated last year
- ⭐12 · Updated 3 months ago
- [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi… ⭐42 · Updated 4 months ago
- ⭐21 · Updated 5 months ago
- ⭐12 · Updated 6 months ago
- ⭐37 · Updated 3 months ago