uiuc-arc / llm-code-watermark
LLM Program Watermarking
☆17 Updated last year
Alternatives and similar repositories for llm-code-watermark
Users who are interested in llm-code-watermark are comparing it to the libraries listed below.
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆51 Updated 2 months ago
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" ☆14 Updated 3 months ago
- ☆29 Updated 10 months ago
- Code for watermarking language models ☆79 Updated 9 months ago
- Official Implementation of the paper "Three Bricks to Consolidate Watermarks for LLMs" ☆48 Updated last year
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts ☆33 Updated 11 months ago
- Code to break Llama Guard ☆31 Updated last year
- A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state. ☆61 Updated last month
- Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025] ☆69 Updated 5 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆145 Updated 8 months ago
- [ICML 2024] Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models ☆23 Updated 9 months ago
- ☆36 Updated 2 years ago
- Official repository for "PostMark: A Robust Blackbox Watermark for Large Language Models" ☆27 Updated 9 months ago
- ☆116 Updated 11 months ago
- ☆26 Updated last year
- Training and Benchmarking LLMs for Code Preference. ☆33 Updated 7 months ago
- CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context ☆17 Updated 8 months ago
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents ☆91 Updated 4 months ago
- Official PyTorch implementation of "Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data" (NeurIPS'23) ☆15 Updated last year
- Open-source repository for the OOPSLA'24 paper "CYCLE: Learning to Self-Refine Code Generation" ☆10 Updated last year
- official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries ☆39 Updated 3 weeks ago
- Dataset for the Tensor Trust project ☆43 Updated last year
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs ☆82 Updated 6 months ago
- ☆64 Updated last year
- ☆76 Updated 3 months ago
- INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness ☆13 Updated 5 months ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆69 Updated last year
- Privacy backdoors ☆51 Updated last year
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆65 Updated 9 months ago
- FANC is a tool for the proof transfer of incomplete verification ☆11 Updated 3 years ago