zhuhong1996 / AI-Guardian
This repository contains the code implementation of the paper "AI-Guardian: Defeating Adversarial Attacks using Backdoors" (IEEE Security and Privacy 2023).
Related projects:
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
- Code for Voice Jailbreak Attacks Against GPT-4o
- Implementation of the IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur…"
- Code and data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents"
- Universal Robustness Evaluation Toolkit (for Evasion)
- Hidden backdoor attack on NLP systems
- Code for "Adversarial Illusions in Multi-Modal Embeddings"
- Official implementation of the preprint "Automatic and Universal Prompt Injection Attacks against Large Language Models"
- Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples
- Official implementation of the paper "Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protecti…"
- [S&P'24] Test-Time Poisoning Attacks Against Test-Time Adaptation Models
- Honest-but-Curious Nets: Sensitive Attributes of Private Inputs Can Be Secretly Coded into the Classifiers' Outputs (ACM CCS'21)
- BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models
- Machine Learning & Security Seminar @ Purdue University
- [ICLR'21] Dataset Inference for Ownership Resolution in Machine Learning
- Code release for DeepJudge (S&P'22)
- Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"
- AdvDoor: Adversarial Backdoor Attack of Deep Learning System
- Repository for the Knowledge-Enhanced Machine Learning Pipeline (KEMLP)
- Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024)
- Code for the S&P'21 paper "Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding"
- [USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models