ZiyueWang25 / llm-security-challenge
Can large language models solve security challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, showing the models' surprising ability to carry out action-oriented cyberexploits in shell environments.
☆13 · Updated last year
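The description above implies an agent loop in which a model proposes shell commands and reads back their output until a level is solved. A minimal sketch of that idea, assuming a Python agent that uses paramiko for SSH and a hypothetical `propose_command()` helper in place of a real model API, might look like the following. The Bandit level-0 host, port, and credentials are OverTheWire's public entry point; this is not the repository's actual code.

```python
# Minimal sketch (not the repository's implementation) of an LLM-driven shell
# loop against OverTheWire Bandit level 0. propose_command() is a hypothetical
# stand-in for whatever model API the agent would call.
import paramiko


def propose_command(transcript: str) -> str:
    """Hypothetical LLM call: given the transcript so far, return the next
    shell command to try. Replace with a real model API call."""
    return "cat readme"  # placeholder suggestion for Bandit level 0


def run_level(max_steps: int = 5) -> str:
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # Publicly documented OverTheWire Bandit entry point (level 0).
    client.connect("bandit.labs.overthewire.org", port=2220,
                   username="bandit0", password="bandit0")

    transcript = ""
    try:
        for _ in range(max_steps):
            cmd = propose_command(transcript)
            _, stdout, stderr = client.exec_command(cmd)
            output = stdout.read().decode() + stderr.read().decode()
            # Feed the command and its output back so the model can decide
            # the next action.
            transcript += f"$ {cmd}\n{output}"
    finally:
        client.close()
    return transcript


if __name__ == "__main__":
    print(run_level())
```

In this setting the transcript accumulates command/output pairs so the model can plan its next step; a stopping condition (for example, detecting the recovered level password) is omitted here for brevity.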
Alternatives and similar repositories for llm-security-challenge
Users interested in llm-security-challenge are comparing it to the repositories listed below.
- Whispers in the Machine: Confidentiality in Agentic Systems · ☆39 · Updated last month
- Tiny package designed to support red teams and penetration testers in exploiting large language model AI solutions. · ☆23 · Updated last year
- LLM security and privacy · ☆48 · Updated 8 months ago
- Code to break Llama Guard · ☆31 · Updated last year
- PAL: Proxy-Guided Black-Box Attack on Large Language Models · ☆51 · Updated 10 months ago
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts. · ☆158 · Updated 2 months ago
- This repository provides a benchmark for prompt injection attacks and defenses · ☆232 · Updated 3 weeks ago
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on… · ☆30 · Updated this week
- ☆74 · Updated 7 months ago
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks · ☆48 · Updated last month
- Official implementation of the paper "DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers" · ☆52 · Updated 10 months ago
- Security Attacks on LLM-based Code Completion Tools (AAAI 2025) · ☆19 · Updated last month
- ☆58 · Updated 6 months ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. · ☆188 · Updated last week
- LLM Self Defense: By Self Examination, LLMs know they are being tricked · ☆34 · Updated last year
- ☆66 · Updated 11 months ago
- TAP: An automated jailbreaking method for black-box LLMs · ☆173 · Updated 6 months ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" · ☆130 · Updated 2 months ago
- Risks and targets for assessing LLMs & LLM vulnerabilities · ☆30 · Updated last year
- LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins · ☆25 · Updated 10 months ago
- Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques", publis… · ☆72 · Updated last year
- Agent Security Bench (ASB) · ☆89 · Updated last week
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts · ☆501 · Updated 9 months ago
- Code to generate NeuralExecs (prompt injection for LLMs) · ☆22 · Updated 7 months ago
- [ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability · ☆154 · Updated 6 months ago
- Papers about red teaming LLMs and multimodal models. · ☆123 · Updated 3 weeks ago
- The official implementation of our preprint paper "Automatic and Universal Prompt Injection Attacks against Large Language Models". · ☆49 · Updated 8 months ago
- Implementation of the BEAST adversarial attack for language models (ICML 2024) · ☆88 · Updated last year
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] · ☆316 · Updated 5 months ago
- Universal Robustness Evaluation Toolkit (for Evasion) · ☆31 · Updated last month