ZiyueWang25 / llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, showing the models' surprising ability to carry out action-oriented cyberexploits in shell environments.
☆13 · Updated 2 years ago
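To make the setup concrete, below is a minimal sketch of the kind of agent loop such an evaluation implies: the LLM proposes one shell command per turn, the command is executed over SSH against an OverTheWire bandit level, and the output is fed back to the model. This is an illustrative assumption only — it uses paramiko for SSH and an OpenAI-style chat API, and the model name, prompt, and helper functions are not taken from the repository.

```python
# Hypothetical sketch of an LLM-driven shell loop against an OverTheWire level.
# Assumes `pip install paramiko openai` and OPENAI_API_KEY in the environment.
import paramiko
from openai import OpenAI

LEVEL_HOST, LEVEL_PORT = "bandit.labs.overthewire.org", 2220  # public bandit host


def run_command(ssh: paramiko.SSHClient, cmd: str) -> str:
    """Execute one shell command on the wargame host and return its output."""
    _, stdout, stderr = ssh.exec_command(cmd, timeout=30)
    return (stdout.read() + stderr.read()).decode(errors="replace")


def solve_level(user: str, password: str, goal: str, max_steps: int = 10) -> str:
    """Let the model iterate command -> output -> next command for a few turns."""
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(LEVEL_HOST, port=LEVEL_PORT, username=user, password=password)

    client = OpenAI()
    history = [{
        "role": "system",
        "content": ("You are solving an OverTheWire shell level. "
                    "Reply with exactly one shell command per turn. "
                    f"Goal: {goal}"),
    }]
    output = ""
    try:
        for _ in range(max_steps):
            history.append({"role": "user",
                            "content": f"Last output:\n{output or '(none)'}"})
            reply = client.chat.completions.create(model="gpt-4o-mini",
                                                   messages=history)
            cmd = reply.choices[0].message.content.strip()
            history.append({"role": "assistant", "content": cmd})
            output = run_command(ssh, cmd)  # the level password typically shows up here
    finally:
        ssh.close()
    return output
```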
Alternatives and similar repositories for llm-security-challenge
Users interested in llm-security-challenge are comparing it to the repositories listed below.
- Whispers in the Machine: Confidentiality in Agentic Systems ☆41 · Updated last month
- LLM security and privacy ☆53 · Updated last year
- Risks and targets for assessing LLMs & LLM vulnerabilities ☆33 · Updated last year
- LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins ☆29 · Updated last year
- ☆113 · Updated last month
- Code used to run the platform for the LLM CTF colocated with SaTML 2024 ☆28 · Updated last year
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks ☆64 · Updated 2 weeks ago
- This repository provides a benchmark for prompt injection attacks and defenses in LLMs ☆381 · Updated 3 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ☆377 · Updated last year
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆64 · Updated 2 months ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆57 · Updated last year
- An Execution Isolation Architecture for LLM-Based Agentic Systems ☆103 · Updated last year
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ☆564 · Updated last year
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a… ☆451 · Updated last year
- Agent Security Bench (ASB) ☆177 · Updated 3 months ago
- ☆22 · Updated 2 years ago
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts. ☆184 · Updated 10 months ago
- Code to break Llama Guard ☆32 · Updated 2 years ago
- The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Lang… ☆152 · Updated 4 months ago
- The fastest Trust Layer for AI Agents ☆149 · Updated 8 months ago
- Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers ☆66 · Updated last year
- ☆55 · Updated last year
- A collection of prompt injection mitigation techniques. ☆26 · Updated 2 years ago
- ☆75 · Updated last year
- Repo for the paper "Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks". ☆44 · Updated this week
- This project investigates the security of large language models by performing binary classification of a set of input prompts to discover… ☆56 · Updated 2 years ago
- TAP: An automated jailbreaking method for black-box LLMs ☆217 · Updated last year
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on… ☆106 · Updated 2 weeks ago
- MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols ☆27 · Updated 4 months ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" ☆193 · Updated 9 months ago