eva-giboulot / WaterMax
A plug-&-play watermark for LLMs with no impact on text quality.
☆5 · Updated 6 months ago
Alternatives and similar repositories for WaterMax:
Users who are interested in WaterMax are comparing it to the libraries listed below.
- [ICLR 2024] Provable Robust Watermarking for AI-Generated Text · ☆30 · Updated last year
- Official repository for "PostMark: A Robust Blackbox Watermark for Large Language Models" · ☆24 · Updated 7 months ago
- Official Implementation of the paper "Three Bricks to Consolidate Watermarks for LLMs" · ☆46 · Updated last year
- ☆53 · Updated 2 years ago
- What do we learn from inverting CLIP models? · ☆53 · Updated last year
- Code for watermarking language models · ☆76 · Updated 6 months ago
- ☆30 · Updated 3 months ago
- Codebase for decoding compressed trust. · ☆23 · Updated 10 months ago
- ☆42 · Updated last month
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024) · ☆59 · Updated 2 months ago
- ☆21 · Updated 2 weeks ago
- Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024. · ☆28 · Updated 4 months ago
- Code for NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" · ☆44 · Updated 2 months ago
- ☆53 · Updated 8 months ago
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep · ☆83 · Updated 8 months ago
- [ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training data of LLMs · ☆37 · Updated last month
- Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models" · ☆24 · Updated last year
- ☆27 · Updated 9 months ago
- Official Repository for Dataset Inference for LLMs · ☆32 · Updated 8 months ago
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) · ☆28 · Updated last year
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives · ☆67 · Updated last year
- ☆32 · Updated 6 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) · ☆75 · Updated 5 months ago
- ☆24 · Updated last month
- ☆22 · Updated last month
- Official PyTorch repo of CVPR'23 and NeurIPS'23 papers on understanding replication in diffusion models. · ☆105 · Updated last year
- ☆18 · Updated last year
- This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition. · ☆85 · Updated 10 months ago
- The official repository of 'Unnatural Languages Are Not Bugs but Features for LLMs' · ☆13 · Updated 3 weeks ago
- ☆25 · Updated 2 weeks ago