briland / LLM-security-and-privacyLinks

LLM security and privacy

☆52

Alternatives and similar repositories for LLM-security-and-privacy

Users that are interested in LLM-security-and-privacy are comparing it to the libraries listed below

Sorting:

liu00222 / Open-Prompt-Injection
This repository provides a benchmark for prompt injection attacks and defenses
☆352Updated last month
Libr-AI / OpenRedTeaming
Papers about red teaming LLMs and Multimodal models.
☆156Updated 6 months ago
CryptoAILab / JailbreakEval
[NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts.
☆172Updated 8 months ago
AI-secure / AgentPoison
[NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"
☆175Updated 7 months ago
leondz / lm_risk_cards
Risks and targets for assessing LLMs & LLM vulnerabilities
☆33Updated last year
BHui97 / PLeak
☆70Updated 11 months ago
agencyenterprise / PromptInject
PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a…
☆438Updated last year
tml-epfl / llm-adaptive-attacks
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
☆365Updated 10 months ago
AIM-Intelligence / Automated-Multi-Turn-Jailbreaks
☆105Updated this week
vinusankars / BEAST
Implementation of BEAST adversarial attack for language models (ICML 2024)
☆91Updated last year
ethz-spylab / agentdojo
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
☆369Updated last week
LostOxygen / llm-confidentiality
Whispers in the Machine: Confidentiality in Agentic Systems
☆41Updated this week
chawins / llm-sp
Papers and resources related to the security and privacy of LLMs 🤖
☆552Updated 6 months ago
microsoft / BIPIA
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
☆90Updated last year
sinanw / llm-security-prompt-injection
This project investigates the security of large language models by performing binary classification of a set of input prompts to discover…
☆53Updated last year
uiuc-kang-lab / InjecAgent
☆95Updated last year
sherdencooper / GPTFuzz
Official repo for GPTFUZZER : Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
☆545Updated last year
agiresearch / ASB
Agent Security Bench (ASB)
☆149Updated last month
patrickrchao / JailbreakingLLMs
☆656Updated 5 months ago
RICommunity / TAP
TAP: An automated jailbreaking method for black-box LLMs
☆200Updated 11 months ago
Yu-Fangxu / COLD-Attack
[ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
☆172Updated 11 months ago
lancopku / agent-backdoor-attacks
Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]
☆102Updated last year
ZhengyuZhao / AI-Security-and-Privacy-Events
A curated list of academic events on AI Security & Privacy
☆166Updated last year
SheltonLiu-N / Universal-Prompt-Injection
The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models".
☆64Updated last year
microsoft / TaskTracker
TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a…
☆77Updated 3 months ago
usail-hkust / JailTrickBench
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024)
☆155Updated last year
lakeraai / pint-benchmark
A benchmark for prompt injection detection systems.
☆151Updated 3 months ago
deadbits / vigil-llm
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
☆431Updated last year
mnns / LLMFuzzer
🧠 LLMFuzzer - Fuzzing Framework for Large Language Models 🧠 LLMFuzzer is the first open-source fuzzing framework specifically designed …
☆328Updated last year
llm-platform-security / SecGPT
An Execution Isolation Architecture for LLM-Based Agentic Systems
☆100Updated 10 months ago