BrianPulfer / LMWatermark
Implementation of the paper "A Watermark for Large Language Models" by Kirchenbauer, Geiping et al.
☆24 · Updated 2 years ago
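For context, the watermark the repository implements partitions the vocabulary into a "green" and "red" list seeded by the previous token, adds a bias δ to green-list logits before sampling, and detects the mark with a one-sided z-test on the green-token count. Below is a minimal pure-Python sketch of that idea, not the repository's actual code: the function names, the SHA-256 seeding, and the `random.Random` shuffle partition are illustrative assumptions.

```python
import hashlib
import math
import random


def green_list(prev_token: int, vocab_size: int, gamma: float = 0.5) -> set:
    # Seed a PRNG with a hash of the previous token id and take the first
    # gamma * vocab_size ids of a shuffled vocabulary as the "green" list.
    # (Illustrative seeding; the paper seeds on the preceding token.)
    seed = int.from_bytes(hashlib.sha256(str(prev_token).encode()).digest()[:8], "big")
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def bias_logits(logits: list, prev_token: int, gamma: float = 0.5, delta: float = 2.0) -> list:
    # "Soft" watermark: add delta to every green-list logit before sampling.
    greens = green_list(prev_token, len(logits), gamma)
    return [l + delta if i in greens else l for i, l in enumerate(logits)]


def detect(tokens: list, vocab_size: int, gamma: float = 0.5) -> float:
    # z-score of observed green hits against the gamma * T count expected
    # for unwatermarked text; large positive z indicates a watermark.
    hits = sum(
        1
        for prev, tok in zip(tokens, tokens[1:])
        if tok in green_list(prev, vocab_size, gamma)
    )
    t = len(tokens) - 1
    return (hits - gamma * t) / math.sqrt(t * gamma * (1 - gamma))
```

Text generated while always sampling from the green list scores a high z (strong watermark evidence), while text drawn from the red list scores negative; real generation sits in between depending on δ and the entropy of the text.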
Alternatives and similar repositories for LMWatermark
Users interested in LMWatermark are comparing it to the repositories listed below.
- Code for watermarking language models ☆82 · Updated last year
- Official implementation of the paper "Three Bricks to Consolidate Watermarks for LLMs" ☆48 · Updated last year
- Source code of the paper "An Unforgeable Publicly Verifiable Watermark for Large Language Models" (ICLR 2024) ☆35 · Updated last year
- [EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP ☆13 · Updated 2 years ago
- Official repository of the paper "On the Exploitability of Instruction Tuning" ☆64 · Updated last year
- ☆43 · Updated 2 years ago
- ☆39 · Updated 2 years ago
- Code and data for the paper "A Semantic Invariant Robust Watermark for Large Language Models" (ICLR 2024) ☆34 · Updated 10 months ago
- [ICCV 2023] Source code for the paper "Rickrolling the Artist: Injecting Invisible Backdoors into Text-Guided Image Generation Models" ☆63 · Updated last year
- ☆45 · Updated 7 months ago
- ☆149 · Updated last year
- [NeurIPS'24] LLM Safety Landscape ☆29 · Updated 7 months ago
- ☆14 · Updated last year
- [ICLR 2025] Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs ☆19 · Updated 6 months ago
- Official repository for Dataset Inference for LLMs ☆41 · Updated last year
- Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks ☆31 · Updated last year
- ☆39 · Updated last year
- [ICML 2024] Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models ☆24 · Updated last year
- Watermarking Text Generated by Black-Box Language Models ☆39 · Updated last year
- [TACL] Code for "Red Teaming Language Model Detectors with Language Models" ☆23 · Updated last year
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆97 · Updated last year
- Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts" ☆39 · Updated last year
- [ICLR'24 Spotlight] DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer ☆46 · Updated last year
- Code for the paper "Universal Jailbreak Backdoors from Poisoned Human Feedback" ☆59 · Updated last year
- Code of the paper "A Recipe for Watermarking Diffusion Models" ☆151 · Updated 10 months ago
- ☆23 · Updated 9 months ago
- ☆20 · Updated last year
- ☆22 · Updated 2 years ago
- Official repository for "PostMark: A Robust Blackbox Watermark for Large Language Models" ☆27 · Updated last year
- Repo for the arXiv preprint "Gradient-based Adversarial Attacks against Text Transformers" ☆109 · Updated 2 years ago