AhmedSalem2 / Model-Hijacking

The code will be released soon!

☆6

Alternatives and similar repositories for Model-Hijacking

Users who are interested in Model-Hijacking are comparing it to the repositories listed below.

Lyz1213 / BadEdit
☆32 Updated 9 months ago
ZhangZhuoSJTU / LINT
☆16 Updated 11 months ago
Django-Jiang / BadChain
[ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
☆36 Updated last year
Unispac / Fight-Poison-With-Poison
Code repository for the paper: [USENIX Security 2023] Towards A Proactive ML Approach for Detecting Backdoor Poison Samples
☆27 Updated 2 years ago
liuyugeng / ML-Doctor
Code for ML Doctor
☆91 Updated 11 months ago
SolidShen / BAIT
🔥🔥🔥 Detecting hidden backdoors in Large Language Models with only black-box access
☆35 Updated 2 months ago
BHui97 / PLeak
☆60 Updated 7 months ago
zhangrui4041 / Instruction_Backdoor_Attack
☆25 Updated 11 months ago
lancopku / agent-backdoor-attacks
Code and data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]
☆85 Updated 10 months ago
chen37058 / Red-Team-Arxiv-Paper-Update
Awesome jailbreak and red-teaming arXiv papers (automatically updated every 12 hours)
☆44 Updated last week
WUSTL-CSPL / LLMJailbreak
☆35 Updated 10 months ago
jianshuod / Engorgio-prompt
The official code for "An Engorgio Prompt Makes Large Language Model Babble on"
☆14 Updated 5 months ago
byerose / Awesome-Foundation-Model-Security
A curated list of trustworthy Generative AI papers. Daily updating...
☆73 Updated 11 months ago
Sizhe-Chen / StruQ
Official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries
☆44 Updated 2 weeks ago
Testing4AI / DeepJudge
Code release for DeepJudge (S&P'22)
☆51 Updated 2 years ago
Kooscii / BadNets
☆99 Updated 4 years ago
KaiyuanZh / CENSOR
[NDSS 2025] CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling
☆15 Updated 6 months ago
GiantSeaweed / DECREE
Official repository for CVPR'23 paper: Detecting Backdoors in Pre-trained Encoders
☆35 Updated last year
Gwinhen / BackdoorVault
A toolbox for backdoor attacks.
☆22 Updated 2 years ago
shuita2333 / AutoDoS
Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings
☆14 Updated 7 months ago
rucnyz / PrivAgent
☆19 Updated 2 months ago
sleeepeer / PoisonedRAG
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
☆176 Updated 5 months ago
DeepLearningSecurityGroup / Cyber_Security_Reading_Group
☆12 Updated last week
ThuCCSLab / misalignment
[NDSS'25] The official implementation of safety misalignment.
☆16 Updated 7 months ago
PurduePAML / Machine-Learning-Security-Seminar
Machine Learning & Security Seminar @Purdue University
☆25 Updated 2 years ago
rotaryhammer / code-autodan
An unofficial implementation of AutoDAN attack on LLMs (arXiv:2310.15140)
☆42 Updated last year
bboylyg / ABL
Anti-Backdoor Learning (NeurIPS 2021)
☆82 Updated 2 years ago
reza321 / T-Miner
☆19 Updated last year
bboylyg / BackdoorLLM
BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models
☆188 Updated last month
reds-lab / BEEAR
This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang…
☆17 Updated last year