large-model-safety / large-model-safety.github.io

☆10

Alternatives and similar repositories for large-model-safety.github.io:

Users who are interested in large-model-safety.github.io compare it to the repositories listed below.

ai-data-model-safety / ai-data-model-safety.github.io
☆26 Updated last month
NY1024 / Foundation-Model-Paper-Notes
☆33 Updated last month
roywang021 / UMK
Code for ACM MM2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models
☆21 Updated last month
RUCAIBox / HADES
[ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆16 Updated 3 months ago
ledllm / ledllm
☆16 Updated 7 months ago
jiawangbai / BadCLIP
Implementation of BadCLIP https://arxiv.org/pdf/2311.16194.pdf
☆18 Updated 10 months ago
NY1024 / BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt
☆31 Updated 7 months ago
huanranchen / VLMTransfer
A package that achieves 95%+ transfer attack success rate against GPT-4
☆17 Updated 3 months ago
huanranchen / AdversarialAttacks
☆64 Updated 6 months ago
ydc123 / MMP-Attack
Official repository for "On the Multi-modal Vulnerability of Diffusion Models"
☆14 Updated 6 months ago
ffhibnese / CPGC_VLP_Universal_Attacks
Universal Adversarial Attack, Multimodal Adversarial Attacks, VLP models, Contrastive Learning, Cross-modal Perturbation Generator, Gener…
☆13 Updated 3 months ago
erfanshayegani / Jailbreak-In-Pieces
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…
☆37 Updated 7 months ago
GuanlinLee / ART
Official Code for ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users (NeurIPS 2024)
☆12 Updated 3 months ago
whdii / TMM
☆12 Updated last year
LiangSiyuan21 / BadCLIP
☆20 Updated 4 months ago
jiaxiaojunQAQ / FP-Better
Code for Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks (TIFS2024)
☆12 Updated 10 months ago
ericyinyzy / VLAttack
This is an official repository of "VLAttack: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models" (NeurIPS 2…
☆41 Updated 3 months ago
ybwang119 / label_recovery
[ICLR 2024] Towards Eliminating Hard Label Constraints in Gradient Inversion Attacks
☆13 Updated 11 months ago
ZhentingWang / LatentTracer
☆24 Updated 6 months ago
researchcode001 / Divide-and-Conquer-Attack
Divide-and-Conquer Attack: Harnessing the Power of LLM to Bypass the Censorship of Text-to-Image Generation Model
☆18 Updated 4 months ago
RU-System-Software-and-Security / BppAttack
☆18 Updated 2 years ago
YitingQu / unsafe-diffusion
☆28 Updated 6 months ago
adversarial-for-goodness / Co-Attack
Official PyTorch implementation of "Towards Adversarial Attack on Vision-Language Pre-training Models"
☆53 Updated last year
DPamK / BadAgent
☆15 Updated 2 weeks ago
Lyz1213 / BadEdit
☆21 Updated 3 months ago
CGCL-codes / TeCo
[CVPR 2023] The official implementation of our CVPR 2023 paper "Detecting Backdoors During the Inference Stage Based on Corruption Robust…
☆21 Updated last year
abc03570128 / Jailbreaking-Attack-against-Multimodal-Large-Language-Model
☆40 Updated 5 months ago
xlhex / NLG_api_watermark
☆9 Updated 2 years ago
KuofengGao / Verbose_Images
[ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
☆25 Updated last year
JunfengGo / SCALE-UP
☆18 Updated 7 months ago