lancopku / DAN

[Findings of EMNLP 2022] Expose Backdoors on the Way: A Feature-Based Efficient Defense against Textual Backdoor Attacks

☆10

Alternatives and similar repositories for DAN:

Users who are interested in DAN are comparing it to the repositories listed below.

lancopku / RAP
Code for the paper "RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models" (EMNLP 2021)
☆24 Updated 3 years ago
weichen-yu / LM-Extraction
☆41 Updated last year
IBM / model-sanitization
Code for reproducing the results of the paper "Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness" published at IC…
☆26 Updated 4 years ago
leix28 / prompt-universal-vulnerability
Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" in Findings of NAACL 2022
☆29 Updated 2 years ago
lancopku / SOS
Code for the paper "Rethinking Stealthiness of Backdoor Attack against NLP Models" (ACL-IJCNLP 2021)
☆22 Updated 3 years ago
papersPapers / BadPrompt
Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"
☆36 Updated 6 months ago
alvinchangw / CARA_EMNLP2020
Implementation of "Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder" (Findings of EMNLP 2020)
☆15 Updated 4 years ago
xinleihe / toxic-prompt
☆20 Updated last year
yfchen1994 / poisoning_membership
☆18 Updated 8 months ago
VITA-Group / Random-Shuffling-BackdoorDetect
[NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh…
☆19 Updated 2 years ago
TrustAIResearch / MLHospital
☆43 Updated last year
TrustAIRLab / VoiceJailbreakAttack
Code for Voice Jailbreak Attacks Against GPT-4o.
☆27 Updated 7 months ago
verazuo / prompt-stealing-attack
[USENIX'24] Prompt Stealing Attacks Against Text-to-Image Generation Models
☆31 Updated last week
thunlp / ONION
Official implementation of the EMNLP 2021 paper "ONION: A Simple and Effective Defense Against Textual Backdoor Attacks"
☆32 Updated 3 years ago
Vaidehi99 / InfoDeletionAttacks
☆39 Updated last year
lancopku / Embedding-Poisoning
Code for the paper "Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models" (NAACL-…
☆39 Updated 3 years ago
mireshghallah / neighborhood-curvature-mia
☆21 Updated last year
UCSC-VLAA / AttnGCG-attack
☆13 Updated 3 months ago
eth-sri / smoothing-ensembles
[ICLR 2022] Boosting Randomized Smoothing with Variance Reduced Classifiers
☆12 Updated 2 years ago
ethz-spylab / rlhf-poisoning
Code for the paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"
☆45 Updated 8 months ago
xinleihe / Semi-Leak
☆11 Updated 2 years ago
XuandongZhao / DRW
[EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP
☆12 Updated last year
weizeming / momentum-attack-llm
☆14 Updated 8 months ago
SCLBD / DBD
☆30 Updated 2 years ago
kyleliang919 / Uncovering-the-Connections-BetweenAdversarial-Transferability-and-Knowledge-Transferability
Code for the ICML 2021 paper exploring the relationship between adversarial transferability and knowledge transferability.
☆17 Updated 2 years ago
google-research / preprocessor-aware-black-box-attack
☆20 Updated last year
sail-sg / AnyDoor
AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models
☆48 Updated 9 months ago
shuaizhao95 / Prompt_attack
☆5 Updated 7 months ago
cleverhans-lab / model-extraction-iclr
☆13 Updated 2 years ago
hammlab / PoisoningCertifiedDefenses
How Robust are Randomized Smoothing based Defenses to Data Poisoning? (CVPR 2021)
☆13 Updated 3 years ago