shuaizhao95 / Prompt_attackLinks

☆0

Alternatives and similar repositories for Prompt_attack

Users that are interested in Prompt_attack are comparing it to the libraries listed below

Sorting:

mireshghallah / neighborhood-curvature-mia
☆21Updated last year
ThuCCSLab / MergeGuard
[CCS-LAMPS'24] LLM IP Protection Against Model Merging
☆15Updated 9 months ago
XuandongZhao / DRW
[EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP
☆13Updated last year
reds-lab / ASSET
This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning…
☆19Updated 2 years ago
Jayfeather1024 / Backdoor-Enhanced-Alignment
☆22Updated 8 months ago
ethz-spylab / rlhf-poisoning
Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"
☆57Updated last year
papersPapers / BadPrompt
Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"
☆39Updated last year
yxoh / prompt_leak_usenix2024
☆13Updated last year
tianshuocong / TePA
[S&P'24] Test-Time Poisoning Attacks Against Test-Time Adaptation Models
☆18Updated 5 months ago
DennisLiu2022 / Membership-Inference-Attacks-by-Exploiting-Loss-Trajectory
☆24Updated 2 years ago
AISafety-HKUST / Backdoor_Safety_Tuning
Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight)
☆26Updated 8 months ago
Vaidehi99 / InfoDeletionAttacks
☆44Updated 6 months ago
bboylyg / RNP
Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023)
☆39Updated last year
weichen-yu / LM-Extraction
☆44Updated 2 years ago
grasses / PoisonPrompt
Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo//124.220.228.133:11107
☆17Updated 11 months ago
yfchen1994 / poisoning_membership
☆20Updated last year
reds-lab / Meta-Sift
The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on …
☆19Updated 2 years ago
RJ-T / NIPS2022_EP_BNP
Official Implementation of NIPS 2022 paper Pre-activation Distributions Expose Backdoor Neurons
☆14Updated 2 years ago
shiningrain / JailGuard
☆17Updated 4 months ago
Sanghyun-Hong / Gradient-Shaping
[Preprint] On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping
☆10Updated 5 years ago
zaixizhang / CBD
Official Inplementation of CVPR23 paper "Backdoor Defense via Deconfounded Representation Learning"
☆26Updated 2 years ago
wagner-group / MarkMyWords
☆30Updated last year
PurduePAML / PICCOLO
☆25Updated 2 years ago
David-Li0406 / AI-Supervision-Risk
☆21Updated 4 months ago
lancopku / SOS
Code for the paper "Rethinking Stealthiness of Backdoor Attack against NLP Models" (ACL-IJCNLP 2021)
☆24Updated 3 years ago
TDteach / Demon-in-the-Variant
☆13Updated 3 years ago
Princeton-SysML / FILM
Official repo for the paper: Recovering Private Text in Federated Learning of Language Models (in NeurIPS 2022)
☆58Updated 2 years ago
lvpeizhuo / MEA-Defender
This is the source code for MEA-Defender. Our paper is accepted by the IEEE Symposium on Security and Privacy (S&P) 2024.
☆25Updated last year
YuYang0901 / EPIC
Not All Poisons are Created Equal: Robust Training against Data Poisoning (ICML 2022)
☆20Updated 3 years ago
jthickstun / watermark
Code for watermarking language models
☆80Updated 11 months ago