lancopku / SOS

Code for the paper "Rethinking Stealthiness of Backdoor Attack against NLP Models" (ACL-IJCNLP 2021)

☆24

Alternatives and similar repositories for SOS

Users who are interested in SOS are comparing it to the repositories listed below

lancopku / RAP
Code for the paper "RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models" (EMNLP 2021)
☆24 Updated 3 years ago
thunlp / HiddenKiller
Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"
☆43 Updated 2 years ago
papersPapers / BadPrompt
Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"
☆38 Updated last year
thunlp / BkdAtk-LWS
Code and data of the ACL 2021 paper "Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution"
☆16 Updated 4 years ago
ShannonAI / backdoor_nlg
☆18 Updated 4 years ago
lishaofeng / NLP_Backdoor
Hidden backdoor attack on NLP systems
☆47 Updated 3 years ago
thunlp / NeuBA
☆25 Updated 4 years ago
lancopku / Embedding-Poisoning
Code for the paper "Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models" (NAACL-…
☆41 Updated 3 years ago
cnut1648 / Model-Fingerprint
Fingerprint large language models
☆41 Updated last year
thunlp / ONION
Official implementation of the EMNLP 2021 paper "ONION: A Simple and Effective Defense Against Textual Backdoor Attacks"
☆34 Updated 3 years ago
alps-lab / trojan-lm
TrojanLM: Trojaning Language Models for Fun and Profit
☆16 Updated 4 years ago
Kiode / Text_Watermark
Watermarking Text Generated by Black-Box Language Models
☆38 Updated last year
PurduePAML / PICCOLO
☆25 Updated 2 years ago
PurduePAML / DBS
☆18 Updated 2 years ago
SCLBD / DBD
☆31 Updated 3 years ago
leix28 / prompt-universal-vulnerability
Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" on Findings of NAACL 2022
☆29 Updated 3 years ago
ethz-spylab / rlhf-poisoning
Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"
☆54 Updated last year
thunlp / OpenBackdoor
An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)
☆184 Updated 2 years ago
byerose / Awesome-Foundation-Model-Security
A curated list of trustworthy Generative AI papers. Daily updating...
☆73 Updated 10 months ago
reds-lab / ASSET
This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning…
☆18 Updated 2 years ago
facebookresearch / text-adversarial-attack
Repo for arXiv preprint "Gradient-based Adversarial Attacks against Text Transformers"
☆107 Updated 2 years ago
THU-BPM / Robust_Watermark
Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024.
☆32 Updated 8 months ago
AISafety-HKUST / Backdoor_Safety_Tuning
Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight)
☆26 Updated 7 months ago
XuandongZhao / DRW
[EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP
☆13 Updated last year
bangawayoo / nlp-watermarking
Robust natural language watermarking using invariant features
☆26 Updated last year
leileigan / clean_label_textual_backdoor_attack
☆18 Updated 3 years ago
csdongxian / ANP_backdoor
Codes for NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models"
☆58 Updated 2 years ago
HKUST-KnowComp / LLM-Multistep-Jailbreak
Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT
☆33 Updated last year
Eyr3 / TextCRS
Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024)
☆34 Updated 2 weeks ago
thunlp / StyleAttack
Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer"
☆43 Updated 2 years ago