uiuc-arc / llm-code-watermark
LLM Program Watermarking
☆17 · Updated last year
Alternatives and similar repositories for llm-code-watermark
Users interested in llm-code-watermark are comparing it to the repositories listed below.
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" (☆48 · Updated 2 months ago)
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives (☆69 · Updated last year)
- Adversarial Attacks on GPT-4 via Simple Random Search [Dec 2023] (☆42 · Updated last year)
- Code for the paper "SrcMarker: Dual-Channel Source Code Watermarking via Scalable Code Transformations" (IEEE S&P 2024) (☆26 · Updated 9 months ago)
- A novel approach to improving the safety of large language models, enabling them to transition effectively from an unsafe to a safe state (☆60 · Updated 2 weeks ago)
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" (☆126 · Updated last month)
- Official implementation of the paper "DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers" (☆52 · Updated 9 months ago)
- Official implementation of [USENIX Sec'25] "StruQ: Defending Against Prompt Injection with Structured Queries" (☆36 · Updated 2 weeks ago)
- Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025] (☆69 · Updated 4 months ago)
- [ICML 2024] Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models (☆23 · Updated 8 months ago)
- Code for watermarking language models (☆79 · Updated 9 months ago)
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents (☆88 · Updated 3 months ago)
- An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024) (☆86 · Updated 4 months ago)
- Dataset for the Tensor Trust project (☆40 · Updated last year)
- Privacy backdoors (☆51 · Updated last year)
- Fluent student-teacher redteaming (☆21 · Updated 10 months ago)
- The official repository of the paper "On the Exploitability of Instruction Tuning" (☆63 · Updated last year)
- Watermark Stealing in Large Language Models (ICML '24) (☆25 · Updated 11 months ago)
- Official implementation of the paper "Three Bricks to Consolidate Watermarks for LLMs" (☆48 · Updated last year)
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts (☆31 · Updated 11 months ago)
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" (☆14 · Updated 2 months ago)
- Whispers in the Machine: Confidentiality in Agentic Systems (☆37 · Updated 2 weeks ago)
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning (☆93 · Updated last year)