lancopku / RAP

Code for the paper "RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models" (EMNLP 2021)

☆25

Alternatives and similar repositories for RAP

Users who are interested in RAP are comparing it to the repositories listed below.

lancopku / SOS
Code for the paper "Rethinking Stealthiness of Backdoor Attack against NLP Models" (ACL-IJCNLP 2021)
☆24 Updated 3 years ago
thunlp / HiddenKiller
Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"
☆43 Updated 3 years ago
lancopku / Embedding-Poisoning
Code for the paper "Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models" (NAACL-…
☆42 Updated 4 years ago
thunlp / ONION
Official implementation of the EMNLP 2021 paper "ONION: A Simple and Effective Defense Against Textual Backdoor Attacks"
☆34 Updated 3 years ago
papersPapers / BadPrompt
Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"
☆39 Updated last year
leix28 / prompt-universal-vulnerability
Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" in Findings of NAACL 2022
☆30 Updated 3 years ago
thunlp / NeuBA
☆25 Updated 4 years ago
ShannonAI / backdoor_nlg
☆18 Updated 4 years ago
PurduePAML / PICCOLO
☆26 Updated 2 years ago
thunlp / StyleAttack
Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer"
☆46 Updated 3 years ago
lishaofeng / NLP_Backdoor
Hidden backdoor attack on NLP systems
☆46 Updated 3 years ago
cnut1648 / Model-Fingerprint
Fingerprint large language models
☆41 Updated last year
alvinchangw / CARA_EMNLP2020
Implementation for Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder (EMNLP-Findings 2020)
☆15 Updated 5 years ago
facebookresearch / text-adversarial-attack
Repo for arXiv preprint "Gradient-based Adversarial Attacks against Text Transformers"
☆109 Updated 2 years ago
alps-lab / trojan-lm
TrojanLM: Trojaning Language Models for Fun and Profit
☆16 Updated 4 years ago
xinleihe / toxic-prompt
☆27 Updated last year
mireshghallah / neighborhood-curvature-mia
☆22 Updated 2 years ago
RockyLzy / TextDefender
Code for "Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution"
☆31 Updated last year
bangawayoo / nlp-watermarking
Robust natural language watermarking using invariant features
☆26 Updated last year
Kiode / Text_Watermark
Watermarking Text Generated by Black-Box Language Models
☆39 Updated last year
yfchen1994 / poisoning_membership
☆20 Updated last year
HKUST-KnowComp / LLM-Multistep-Jailbreak
Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT
☆35 Updated last year
PurduePAML / DBS
☆18 Updated 3 years ago
XuandongZhao / DRW
[EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP
☆13 Updated 2 years ago
ethz-spylab / rlhf-poisoning
Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"
☆60 Updated last year
reds-lab / ASSET
This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning…
☆19 Updated 2 years ago
SCLBD / DBD
☆31 Updated 3 years ago
AI-secure / Robustness-Against-Backdoor-Attacks
RAB: Provable Robustness Against Backdoor Attacks
☆39 Updated 2 years ago
YancyKahn / CoA
Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM
☆36 Updated 8 months ago
dugu9sword / dne
ACL 2021 - Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble
☆18 Updated 2 years ago