hzy312 / Awesome-LLM-Watermark

Up-to-date LLM watermark papers. 🔥🔥🔥

☆357

Alternatives and similar repositories for Awesome-LLM-Watermark

Users who are interested in Awesome-LLM-Watermark are comparing it to the repositories listed below.

git-disl / awesome_LLM-harmful-fine-tuning-papers
A survey on harmful fine-tuning attacks for large language models
☆212 Updated this week
Xianjun-Yang / Awesome_papers_on_LLMs_detection
The latest papers on detection of LLM-generated text and code
☆277 Updated 3 months ago
THU-KEG / WaterBench
[ACL2024-Main] Data and Code for WaterBench: Towards Holistic Evaluation of LLM Watermarks
☆28 Updated last year
lancopku / codable-watermarking-for-llm
Repository for Towards Codable Watermarking for Large Language Models
☆38 Updated 2 years ago
xiaoniu-578fa6bff964d005 / UnbiasedWatermark
☆39 Updated last year
kevinyaobytedance / llm_unlearn
LLM Unlearning
☆175 Updated last year
hongcheki / sweet-watermark
Official repository of the paper: Who Wrote this Code? Watermarking for Code Generation (ACL 2024)
☆38 Updated last year
chrisliu298 / awesome-llm-unlearning
A resource repository for machine unlearning in large language models
☆493 Updated 2 months ago
jwkirchenbauer / lm-watermarking
☆629 Updated 3 weeks ago
jthickstun / watermark
Code for watermarking language models
☆82 Updated last year
THU-BPM / Robust_Watermark
Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024.
☆34 Updated 10 months ago
SproutNan / AI-Safety_SCAV
This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"
☆44 Updated 10 months ago
inspire-group / RobustRAG
☆20 Updated last year
abehou / SemStamp
Repo for SemStamp (NAACL2024) and k-SemStamp (ACL2024)
☆23 Updated 10 months ago
bangawayoo / mb-lm-watermarking
multi-bit language model watermarking (NAACL 24)
☆15 Updated last year
niconi19 / LLM-Conversation-Safety
[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
☆106 Updated last year
THU-BPM / unforgeable_watermark
Source code of paper "An Unforgeable Publicly Verifiable Watermark for Large Language Models" accepted by ICLR 2024
☆35 Updated last year
facebookresearch / advprompter
Official implementation of AdvPrompter https://arxiv.org/abs/2404.16873
☆166 Updated last year
facebookresearch / three_bricks
Official Implementation of the paper "Three Bricks to Consolidate Watermarks for LLMs"
☆48 Updated last year
thu-coai / JailbreakDefense_GoalPriority
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
☆29 Updated last year
qingjiesjtu / USC
This is the code repository of our submission: Understanding the Dark Side of LLMs’ Intrinsic Self-Correction.
☆63 Updated 9 months ago
eth-sri / llmprivacy
☆68 Updated 7 months ago
isXinLiu / MM-SafetyBench
Accepted by ECCV 2024
☆158 Updated 11 months ago
bangawayoo / nlp-watermarking
Robust natural language watermarking using invariant features
☆26 Updated last year
KID-22 / LLM-Unlearning-Paper-List
☆28 Updated last year
iamgroot42 / mimir
Python package for measuring memorization in LLMs.
☆167 Updated 2 months ago
sleeepeer / PoisonedRAG
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
☆202 Updated 7 months ago
byerose / Awesome-Foundation-Model-Security
A curated list of trustworthy Generative AI papers. Daily updating...
☆74 Updated last year
LLM-Tuning-Safety / LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20…
☆323 Updated last year
OpenSafetyLab / SALAD-BENCH
【ACL 2024】 SALAD benchmark & MD-Judge
☆161 Updated 7 months ago