ZiyueWang25 / llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, showing the models' surprising ability to carry out action-oriented cyberexploits in a shell.
★13 · Updated 2 years ago
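The description suggests an agent loop: the model is given a wargame objective, proposes a shell command, the command runs on the remote OverTheWire host, and the output is fed back until the level's goal (typically a password) is reached. Below is a minimal sketch of such a loop, assuming Paramiko for SSH; `ask_model` is a hypothetical placeholder for whatever LLM client you use, and the Bandit level 0 credentials are OverTheWire's published starting credentials. This is an illustration, not the repository's actual code.

```python
# Minimal sketch (not the repository's code): an LLM-in-the-loop agent that
# plays OverTheWire Bandit over SSH.
import paramiko


def ask_model(goal: str, transcript: list[str]) -> str:
    """Placeholder: return the next shell command the model proposes.

    A real run would call an LLM API with the goal and the transcript of
    commands and outputs so far.
    """
    raise NotImplementedError("plug in your LLM client here")


def run_level(goal: str, max_steps: int = 10) -> None:
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # bandit0/bandit0 are the credentials OverTheWire publishes for level 0.
    client.connect("bandit.labs.overthewire.org", port=2220,
                   username="bandit0", password="bandit0")

    transcript: list[str] = []
    try:
        for _ in range(max_steps):
            command = ask_model(goal, transcript)
            _, stdout, stderr = client.exec_command(command)
            output = stdout.read().decode() + stderr.read().decode()
            transcript.append(f"$ {command}\n{output}")
            if "password" in output.lower():  # crude stop condition for the sketch
                break
    finally:
        client.close()


if __name__ == "__main__":
    run_level("Find the password for the next Bandit level.")
```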
Alternatives and similar repositories for llm-security-challenge
Users interested in llm-security-challenge are comparing it to the repositories listed below.
- Whispers in the Machine: Confidentiality in Agentic Systems ★41 · Updated 3 weeks ago
- Tiny package designed to support red teams and penetration testers in exploiting large language model AI solutions. ★27 · Updated last year
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks ★56 · Updated 5 months ago
- LLM security and privacy ★51 · Updated last year
- Code used to run the platform for the LLM CTF colocated with SaTML 2024 ★27 · Updated last year
- LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins ★28 · Updated last year
- ★94 · Updated 11 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ★359 · Updated 9 months ago
- Risks and targets for assessing LLMs & LLM vulnerabilities ★32 · Updated last year
- An Execution Isolation Architecture for LLM-Based Agentic Systems ★97 · Updated 9 months ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" ★162 · Updated 6 months ago
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a… ★429 · Updated last year
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts. ★172 · Updated 7 months ago
- Code to break Llama Guard ★32 · Updated last year
- ★109 · Updated 6 months ago
- ★85 · Updated last year
- The official implementation of our NAACL 2024 paper "A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Lang… ★141 · Updated 2 months ago
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ★533 · Updated last year
- Fine-tuning base models to build robust task-specific models ★34 · Updated last year
- ★65 · Updated 10 months ago
- This repository provides a benchmark for prompt injection attacks and defenses ★318 · Updated this week
- Implementation for "RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content" ★22 · Updated last year
- This project explores training data extraction attacks on the LLaMa 7B, GPT-2XL, and GPT-2-IMDB models to discover memorized content usin… ★15 · Updated 2 years ago
- A collection of prompt injection mitigation techniques. ★24 · Updated 2 years ago
- Agent Security Bench (ASB) ★141 · Updated last week
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ★112 · Updated last year
- TAP: An automated jailbreaking method for black-box LLMs ★194 · Updated 10 months ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ★55 · Updated last year
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks. ★87 · Updated last year
- ★52 · Updated last year