hlzhang109 / impossibility-watermark
☆20 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for impossibility-watermark
- Official implementation of the paper "Three Bricks to Consolidate Watermarks for LLMs" ☆43 · Updated 9 months ago
- Code for watermarking language models ☆72 · Updated 2 months ago
- ☆53 · Updated last year
- ☆25 · Updated 5 months ago
- ☆21 · Updated 5 months ago
- Code and data for the paper "A Semantic Invariant Robust Watermark for Large Language Models" (ICLR 2024) ☆25 · Updated last week
- ☆12 · Updated 8 months ago
- [NeurIPS 2023] Differentially Private Image Classification by Learning Priors from Random Processes ☆11 · Updated last year
- ☆32 · Updated 11 months ago
- Implementation of the paper "A Watermark for Large Language Models" by Kirchenbauer, Geiping, et al. ☆23 · Updated last year
- Code repo of the paper "Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis" (https://arxiv.org/abs/2406.10794… ☆12 · Updated 3 months ago
- Landing page for TOFU ☆98 · Updated 5 months ago
- Official implementation of the ECCV'24 paper "To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Uns…" ☆58 · Updated 2 weeks ago
- Code for the paper "Universal Jailbreak Backdoors from Poisoned Human Feedback" ☆41 · Updated 6 months ago
- "In-Context Unlearning: Language Models as Few Shot Unlearners". Martin Pawelczyk, Seth Neel*, and Himabindu Lakkaraju*; ICML 2024. ☆15 · Updated last year
- ☆20 · Updated 9 months ago
- ☆49 · Updated last year
- Code for the NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" ☆28 · Updated last month
- ☆38 · Updated last year
- ☆26 · Updated 3 weeks ago
- ☆78 · Updated last week
- Robust natural language watermarking using invariant features ☆25 · Updated last year
- Official implementation of WEvade ☆37 · Updated 8 months ago
- Certified robustness "for free" using off-the-shelf diffusion models and classifiers ☆36 · Updated last year
- Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts" ☆36 · Updated 4 months ago
- ☆19 · Updated 3 weeks ago
- Official code of the paper "A Closer Look at Machine Unlearning for Large Language Models" ☆13 · Updated last month
- Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight) ☆24 · Updated this week
- ☆15 · Updated 6 months ago
- Official code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models" ☆20 · Updated last year