xlhex / cater_neurips

☆6

Alternatives and similar repositories for cater_neurips

Users who are interested in cater_neurips are comparing it to the repositories listed below.

weichen-yu / LM-Extraction
☆44 Updated 2 years ago
mireshghallah / neighborhood-curvature-mia
☆21 Updated last year
xinleihe / toxic-prompt
☆24 Updated last year
leix28 / prompt-universal-vulnerability
Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" on Findings of NAACL 2022
☆29 Updated 3 years ago
Vaidehi99 / InfoDeletionAttacks
☆44 Updated 5 months ago
xiangyue9607 / Sentence-LDP
Code for the WWW'23 paper "Sanitizing Sentence Embeddings (and Labels) for Local Differential Privacy"
☆12 Updated 2 years ago
lancopku / RAP
Code for the paper "RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models" (EMNLP 2021)
☆24 Updated 3 years ago
lancopku / SOS
Code for the paper "Rethinking Stealthiness of Backdoor Attack against NLP Models" (ACL-IJCNLP 2021)
☆24 Updated 3 years ago
AI-secure / Robustness-Against-Backdoor-Attacks
RAB: Provable Robustness Against Backdoor Attacks
☆39 Updated last year
papersPapers / BadPrompt
Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"
☆38 Updated last year
yfchen1994 / poisoning_membership
☆20 Updated last year
alvinchangw / CARA_EMNLP2020
Implementation for Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder (EMNLP-Findings 2020)
☆15 Updated 4 years ago
yjw1029 / Self-Reminder-Data
Data for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder"
☆19 Updated last year
wyshi / lm_privacy
☆18 Updated 3 years ago
DingfanChen / RelaxLoss
Official implementation of "RelaxLoss: Defending Membership Inference Attacks without Losing Utility" (ICLR 2022)
☆50 Updated 2 years ago
Sanghyun-Hong / Gradient-Shaping
[Preprint] On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping
☆10 Updated 5 years ago
phycholosogy / RAG-privacy
The code for paper "The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)", exploring the privacy risk o…
☆50 Updated 5 months ago
huseyinatahaninan / Differentially-Private-Fine-tuning-of-Language-Models
☆74 Updated 3 years ago
DennisLiu2022 / Membership-Inference-Attacks-by-Exploiting-Loss-Trajectory
☆24 Updated 2 years ago
thunlp / NeuBA
☆25 Updated 4 years ago
XuandongZhao / DRW
[EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP
☆13 Updated last year
pratyushmaini / llm_dataset_inference
Official Repository for Dataset Inference for LLMs
☆35 Updated 11 months ago
wagner-group / MarkMyWords
☆30 Updated last year
xlhex / dpnlp
☆9 Updated 4 years ago
SolidShen / RIPPLE_official
☆20 Updated last year
lapisrocks / rpo
Official repository for "Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks"
☆55 Updated 11 months ago
ethz-spylab / rlhf-poisoning
Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"
☆54 Updated last year
Jayfeather1024 / Backdoor-Enhanced-Alignment
☆20 Updated 7 months ago
HKUST-KnowComp / LLM-Multistep-Jailbreak
Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT
☆33 Updated last year
VITA-Group / DP-OPT
[ICLR'24 Spotlight] DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
☆44 Updated last year