uiuc-arc / llm-code-watermark
LLM Program Watermarking
☆18Updated last year
Alternatives and similar repositories for llm-code-watermark
Users who are interested in llm-code-watermark compare it to the repositories listed below.
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"☆70Updated last month
- A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state.☆65Updated 3 months ago
- Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025]☆74Updated 7 months ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives☆70Updated last year
- Code for watermarking language models☆82Updated last year
- Official Implementation of the paper "Three Bricks to Consolidate Watermarks for LLMs"☆48Updated last year
- Repoformer: Selective Retrieval for Repository-Level Code Completion (ICML 2024)☆59Updated 3 months ago
- The official repository of the paper "On the Exploitability of Instruction Tuning".☆64Updated last year
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs☆91Updated 9 months ago
- An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)☆99Updated 7 months ago
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"☆15Updated 5 months ago
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding☆143Updated last year
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"☆148Updated 5 months ago
- INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness☆15Updated 8 months ago
- Code to break Llama Guard☆32Updated last year
- Python package for measuring memorization in LLMs.☆166Updated 2 months ago
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents☆102Updated 6 months ago
- Implementation of 'A Watermark for Large Language Models' paper by Kirchenbauer & Geiping et al.☆24Updated 2 years ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning☆97Updated last year
- Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024.☆34Updated 10 months ago
- Official repository of the paper: Who Wrote this Code? Watermarking for Code Generation (ACL 2024)☆38Updated last year
- The official repository for guided jailbreak benchmark☆18Updated last month
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents☆48Updated 2 months ago