# schroederdewitt/perfectly-secure-steganography
Contains open source code for the paper "Perfectly-secure Steganography using Minimum Entropy Coupling"
☆53 · Updated last year
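The paper's core primitive, a minimum entropy coupling, combines two marginal distributions (for steganography: the distribution over ciphertext symbols and a language model's next-token distribution) into a joint distribution with as little added entropy as possible. The sketch below shows the standard greedy pairing heuristic for approximate minimum-entropy coupling; it is an illustration of the general technique, not the repository's actual implementation, and the function name is hypothetical.

```python
def greedy_mec(p, q, tol=1e-12):
    """Approximate minimum-entropy coupling of two discrete
    distributions p and q via the greedy heuristic: repeatedly
    pair the largest remaining masses of each marginal and move
    their minimum into the joint cell. (Illustrative sketch, not
    the repo's implementation.)"""
    p = list(p)
    q = list(q)
    joint = {}  # (i, j) -> probability mass
    while True:
        i = max(range(len(p)), key=lambda k: p[k])
        j = max(range(len(q)), key=lambda k: q[k])
        m = min(p[i], q[j])
        if m <= tol:  # all mass assigned
            break
        joint[(i, j)] = joint.get((i, j), 0.0) + m
        p[i] -= m
        q[j] -= m
    return joint
```

By construction the result marginalizes back to `p` (summing over `j`) and `q` (summing over `i`); the greedy pairing keeps the joint entropy close to `max(H(p), H(q))`, which is what lets a stegotext sample carry ciphertext information without distorting the covertext distribution.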
## Alternatives and similar repositories

Users interested in perfectly-secure-steganography are also comparing it to the repositories listed below:
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024. ☆109 · Updated 9 months ago
- ☆34 · Updated last year
- Fluent student-teacher redteaming. ☆20 · Updated 8 months ago
- Code to break Llama Guard. ☆31 · Updated last year
- ☆288 · Updated last year
- ☆26 · Updated last month
- Starter kit and data loading code for the Trojan Detection Challenge NeurIPS 2022 competition. ☆33 · Updated last year
- ☆94 · Updated last year
- ☆67 · Updated last year
- ☆53 · Updated 2 years ago
- Code for our S&P'21 paper: Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding. ☆51 · Updated 2 years ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives. ☆67 · Updated last year
- Starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition. ☆85 · Updated 10 months ago
- What do we learn from inverting CLIP models? ☆53 · Updated last year
- Improving Alignment and Robustness with Circuit Breakers. ☆192 · Updated 6 months ago
- Independent robustness evaluation of "Improving Alignment and Robustness with Short Circuiting". ☆16 · Updated 7 months ago
- Privacy backdoors. ☆51 · Updated 11 months ago
- Codebase for "Obfuscated Activations Bypass LLM Latent-Space Defenses". ☆15 · Updated last month
- ☆26 · Updated last year
- Adversarial Attacks on GPT-4 via Simple Random Search [Dec 2023]. ☆43 · Updated 11 months ago
- Implementation of the RASP transformer programming language (https://arxiv.org/pdf/2106.06981.pdf). ☆52 · Updated 3 years ago
- PyTorch datasets for Easy-To-Hard. ☆27 · Updated 2 months ago
- ☆263 · Updated last year
- ☆33 · Updated 6 months ago
- A library for mechanistic anomaly detection. ☆21 · Updated 2 months ago
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ☆197 · Updated last week
- A plug-and-play watermark for LLMs with no impact on text quality. ☆5 · Updated 6 months ago
- [ICML 2024] Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models. ☆22 · Updated 6 months ago
- Discount jupyter. ☆50 · Updated 3 weeks ago
- Code for watermarking language models. ☆76 · Updated 6 months ago