casperllm / CASPER

☆15

Alternatives and similar repositories for CASPER

Users who are interested in CASPER are comparing it to the libraries listed below.

ltroin / llm_attack_defense_arena
☆82Updated last year
thunlp / HiddenKiller
Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"
☆43Updated 2 years ago
ledllm / ledllm
☆20Updated last year
Lyz1213 / BadEdit
☆30Updated 9 months ago
Lyz1213 / Backdoored_PPLM
☆14Updated last year
wegodev2 / virtual-prompt-injection
Unofficial implementation of "Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection"
☆20Updated last year
YancyKahn / CoA
Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM
☆34Updated 6 months ago
lancopku / codable-watermarking-for-llm
Repository for Towards Codable Watermarking for Large Language Models
☆37Updated last year
NY1024 / Foundation-Model-Paper-Notes
☆56Updated last month
Raytsang123 / CLIBE
[NDSS 2025] "CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models"
☆15Updated 7 months ago
wang2226 / Trojan-Activation-Attack
[CIKM 2024] Trojan Activation Attack: Attack Large Language Models using Activation Steering for Safety-Alignment.
☆25Updated 11 months ago
neelsjain / baseline-defenses
Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models"
☆25Updated last year
AI45Lab / CodeAttack
[ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
☆47Updated 8 months ago
thu-coai / JailbreakDefense_GoalPriority
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
☆26Updated last year
PurduePAML / DBS
☆18Updated 2 years ago
LLMSecurity / MasterKey
MASTERKEY is a framework designed to explore and exploit vulnerabilities in large language model chatbots by automating jailbreak attacks…
☆24Updated 10 months ago
YihanWang617 / llm-jailbreaking-defense
A lightweight library for large language model (LLM) jailbreaking defense.
☆52Updated 8 months ago
meng-wenlong / LMSanitator
☆21Updated last year
OSU-NLP-Group / AmpleGCG
AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM
☆68Updated 8 months ago
sleeepeer / PoisonedRAG
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
☆170Updated 4 months ago
yuplin2333 / representation-space-jailbreak
Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…
☆20Updated 11 months ago
niconi19 / LLM-Conversation-Safety
[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
☆104Updated 11 months ago
AI45Lab / ActorAttack
☆91Updated 5 months ago
kriti-hippo / red_queen
Red Queen Dataset and data generation template
☆16Updated 9 months ago
thunlp / OpenBackdoor
An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)
☆184Updated 2 years ago
lancopku / agent-backdoor-attacks
Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]
☆81Updated 9 months ago
reds-lab / ASSET
This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning…
☆18Updated 2 years ago
shiningrain / JailGuard
☆16Updated 4 months ago
Gwinhen / DRUPE
Distribution Preserving Backdoor Attack in Self-supervised Learning
☆16Updated last year
MarkGHX / BiScope
Official Implementation of NeurIPS 2024 paper - BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
☆20Updated 3 months ago