bohanhou14 / SemStamp

Repo for SemStamp (NAACL2024) and k-SemStamp (ACL2024)

☆17

Alternatives and similar repositories for SemStamp:

Users who are interested in SemStamp are comparing it to the repositories listed below.

bangawayoo / mb-lm-watermarking
multi-bit language model watermarking (NAACL 24)
☆11 Updated 4 months ago
xiaojunxu / learning-to-watermark-llm
☆16 Updated 10 months ago
lancopku / codable-watermarking-for-llm
Repository for Towards Codable Watermarking for Large Language Models
☆34 Updated last year
bangawayoo / nlp-watermarking
Robust natural language watermarking using invariant features
☆26 Updated last year
THU-BPM / Robust_Watermark
Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024.
☆26 Updated 2 months ago
thunlp / HiddenKiller
Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"
☆41 Updated 2 years ago
Lyz1213 / BadEdit
☆21 Updated 3 months ago
THU-KEG / WaterBench
[ACL2024-Main] Data and Code for WaterBench: Towards Holistic Evaluation of LLM Watermarks
☆21 Updated last year
hongcheki / sweet-watermark
Official repository of the paper: Who Wrote this Code? Watermarking for Code Generation (ACL 2024)
☆32 Updated 8 months ago
git-disl / awesome_LLM-harmful-fine-tuning-papers
A survey on harmful fine-tuning attack for large language model
☆129 Updated 2 weeks ago
MiracleHH / CBA
Composite Backdoor Attacks Against Large Language Models
☆11 Updated 9 months ago
shaoshuo-ss / EaaW
Official code for our NDSS paper "Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarkin…
☆25 Updated 2 months ago
ruisizhang123 / REMARK-LLM
[USENIX Security'24] REMARK-LLM: A robust and efficient watermarking framework for generative large language models
☆19 Updated 3 months ago
xlhex / NLG_api_watermark
☆9 Updated 2 years ago
wang2226 / Trojan-Activation-Attack
[CIKM 2024] Trojan Activation Attack: Attack Large Language Models using Activation Steering for Safety-Alignment.
☆20 Updated 6 months ago
byerose / Awesome-Foundation-Model-Security
A curated list of trustworthy Generative AI papers. Daily updating...
☆68 Updated 4 months ago
THU-BPM / unforgeable_watermark
Source code of paper "An Unforgeable Publicly Verifiable Watermark for Large Language Models" accepted by ICLR 2024
☆31 Updated 8 months ago
ThuCCSLab / MergeGuard
[CCS-LAMPS'24] LLM IP Protection Against Model Merging
☆11 Updated 3 months ago
isXinLiu / MM-SafetyBench
Accepted by ECCV 2024
☆92 Updated 3 months ago
mignonjia / TS_watermark
☆13 Updated 3 months ago
qingjiesjtu / USC
This is the code repository of our submission: Understanding the Dark Side of LLMs’ Intrinsic Self-Correction.
☆53 Updated last month
Django-Jiang / BadChain
[ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
☆23 Updated 6 months ago
xiaoniu-578fa6bff964d005 / UnbiasedWatermark
☆34 Updated 5 months ago
hzy312 / Awesome-LLM-Watermark
UP-TO-DATE LLM Watermark paper. 🔥🔥🔥
☆320 Updated last month
grasses / PromptCARE
Code for paper: "PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification", IEEE S&P 2024.
☆30 Updated 5 months ago
MartinPawel / In-Context-Unlearning
"In-Context Unlearning: Language Models as Few Shot Unlearners". Martin Pawelczyk, Seth Neel* and Himabindu Lakkaraju*; ICML 2024.
☆21 Updated last year
thu-coai / JailbreakDefense_GoalPriority
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
☆18 Updated 6 months ago
NY1024 / Foundation-Model-Paper-Notes
☆33 Updated last month
chenchenygu / watermark-learnability
☆22 Updated 8 months ago
jthickstun / watermark
Code for watermarking language models
☆76 Updated 4 months ago