XuandongZhao / Ginsew
[ICML 2023] Protecting Language Generation Models via Invisible Watermarking
☆13 Updated last year
Related projects
Alternatives and complementary repositories for Ginsew
- Code Repo for the NeurIPS 2023 paper "VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models" ☆19 Updated 2 months ago
- Code for paper: "PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification", IEEE S&P 2024. ☆28 Updated 3 months ago
- [MM'23 Oral] "Text-to-image diffusion models can be easily backdoored through multimodal data poisoning" ☆22 Updated 2 months ago
- [CVPR 2023] Backdoor Defense via Adaptively Splitting Poisoned Dataset ☆44 Updated 7 months ago
- Implementation of BadCLIP https://arxiv.org/pdf/2311.16194.pdf ☆17 Updated 7 months ago
- Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation (NeurIPS 2022) ☆33 Updated last year
- [CVPR23W] "A Pilot Study of Query-Free Adversarial Attack against Stable Diffusion" by Haomin Zhuang, Yihua Zhang and Sijia Liu ☆24 Updated 2 months ago
- GitHub repo for One-shot Neural Backdoor Erasing via Adversarial Weight Masking (NeurIPS 2022) ☆14 Updated last year
- This code is the official implementation of WEvade. ☆37 Updated 8 months ago
- Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794… ☆12 Updated 3 months ago
- Official implementation of Towards Robust Model Watermark via Reducing Parametric Vulnerability ☆12 Updated 5 months ago
- Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight) ☆24 Updated this week
- Official Implementation of NeurIPS 2022 paper Pre-activation Distributions Expose Backdoor Neurons ☆14 Updated last year
- Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo: 124.220.228.133:11107 ☆12 Updated 3 months ago
- Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023) ☆28 Updated 10 months ago
- Robust natural language watermarking using invariant features ☆25 Updated last year
- [ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images ☆24 Updated 9 months ago
- Code for the paper "Autoregressive Perturbations for Data Poisoning" (NeurIPS 2022) ☆18 Updated 2 months ago
- The official implementation of our CVPR 2023 paper "Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consist… ☆19 Updated last year
- Official TensorFlow implementation for "Improving Adversarial Transferability via Neuron Attribution-based Attacks" (CVPR 2022) ☆33 Updated last year
- Source code for the paper "An Unforgeable Publicly Verifiable Watermark for Large Language Models", accepted at ICLR 2024 ☆28 Updated 5 months ago
- [ECCV-2024] Transferable Targeted Adversarial Attack, CLIP models, Generative adversarial network, Multi-target attacks ☆22 Updated 3 months ago