uiuc-arc / llm-code-watermark
LLM Program Watermarking
☆17 · Updated 11 months ago
Alternatives and similar repositories for llm-code-watermark:
Users interested in llm-code-watermark are comparing it to the repositories listed below.
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆41 · Updated 2 months ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆67 · Updated last year
- Implementation of the paper "A Watermark for Large Language Models" by Kirchenbauer, Geiping, et al. ☆23 · Updated 2 years ago
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" ☆14 · Updated last week
- [ICML 2024] Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models ☆22 · Updated 6 months ago
- Code for watermarking language models ☆76 · Updated 6 months ago
- Official implementation of [USENIX Sec'25] "StruQ: Defending Against Prompt Injection with Structured Queries" ☆31 · Updated 2 weeks ago
- Official implementation of the paper "Three Bricks to Consolidate Watermarks for LLMs" ☆46 · Updated last year
- Official implementation of the paper "DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers" ☆51 · Updated 7 months ago
- Code to break Llama Guard ☆31 · Updated last year
- Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025] ☆66 · Updated 2 months ago
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs ☆68 · Updated 4 months ago
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts ☆30 · Updated 9 months ago
- Repository for "Towards Codable Watermarking for Large Language Models" ☆36 · Updated last year
- Official repository for "Dataset Inference for LLMs" ☆32 · Updated 8 months ago
- Official implementation of the pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models" ☆43 · Updated 5 months ago
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a… ☆50 · Updated 3 weeks ago
- Dataset for the Tensor Trust project ☆39 · Updated last year
- RepoQA: Evaluating Long-Context Code Understanding ☆107 · Updated 5 months ago
- Code for the paper "SrcMarker: Dual-Channel Source Code Watermarking via Scalable Code Transformations" (IEEE S&P 2024) ☆25 · Updated 7 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆135 · Updated 5 months ago
- [ICLR 2025] Dissecting Adversarial Robustness of Multimodal LM Agents ☆79 · Updated last month
- EvoEval: Evolving Coding Benchmarks via LLM ☆68 · Updated 11 months ago
- Code to generate NeuralExecs (prompt injection for LLMs) ☆20 · Updated 4 months ago
- WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m… ☆108 · Updated 11 months ago
- All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks ☆16 · Updated 11 months ago
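Several repositories above implement the green-list watermark from "A Watermark for Large Language Models" (Kirchenbauer et al.). The core idea can be sketched in a few lines: the previous token seeds a pseudo-random partition of the vocabulary into "green" and "red" halves, generation favors green tokens, and a detector counts green tokens and runs a one-proportion z-test. This is a toy illustration with a hypothetical tiny vocabulary and a hard red-list rule, not code from any of the listed repos; real implementations use a keyed hash over the model's full vocabulary and soft logit bias.

```python
import math
import random

VOCAB_SIZE = 1_000  # toy vocabulary; real tokenizers have 50k+ tokens
GAMMA = 0.5         # fraction of the vocabulary marked "green" at each step

def green_list(prev_token: int) -> set[int]:
    """Seed a PRNG with the previous token to partition the vocabulary."""
    rng = random.Random(prev_token)
    perm = list(range(VOCAB_SIZE))
    rng.shuffle(perm)
    return set(perm[: int(GAMMA * VOCAB_SIZE)])

def watermarked_sample(length: int, seed: int = 0) -> list[int]:
    """Toy 'generator': always emit a green token (hard red-list rule)."""
    rng = random.Random(seed)
    tokens = [rng.randrange(VOCAB_SIZE)]
    for _ in range(length - 1):
        tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
    return tokens

def z_score(tokens: list[int]) -> float:
    """One-proportion z-test: how far the green-token count exceeds chance."""
    n = len(tokens) - 1
    hits = sum(cur in green_list(prev)
               for prev, cur in zip(tokens, tokens[1:]))
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))
```

With a hard rule every pair lands in the green list, so a 50-token watermarked sequence scores z ≈ 7, while unwatermarked random text hovers near 0; soft-bias variants trade detection strength for text quality.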