ZiyueWang25 / llm-security-challengeLinks
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact and break out of shell environments using the OverTheWire wargames environment, showing the models' surprising ability to do action-oriented cyberexploits in shell environments
☆13Updated 2 years ago
Alternatives and similar repositories for llm-security-challenge
Users that are interested in llm-security-challenge are comparing it to the libraries listed below
Sorting:
- Whispers in the Machine: Confidentiality in Agentic Systems☆40Updated 3 weeks ago
- LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins☆28Updated last year
- LLM security and privacy☆50Updated 10 months ago
- Risks and targets for assessing LLMs & LLM vulnerabilities☆32Updated last year
- Code to break Llama Guard☆32Updated last year
- ☆85Updated 9 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]☆336Updated 7 months ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.☆251Updated 3 weeks ago
- ☆61Updated 8 months ago
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on…☆56Updated last month
- Official repo for GPTFUZZER : Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts☆524Updated 11 months ago
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks☆53Updated 3 months ago
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts.☆167Updated 5 months ago
- Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers☆58Updated last year
- This repository provides a benchmark for prompt Injection attacks and defenses☆267Updated last month
- ☆145Updated 2 months ago
- Automated Safety Testing of Large Language Models☆16Updated 7 months ago
- TAP: An automated jailbreaking method for black-box LLMs☆184Updated 8 months ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models☆54Updated last year
- Universal Robustness Evaluation Toolkit (for Evasion)☆31Updated 3 months ago
- ☆48Updated 11 months ago
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a…☆410Updated last year
- The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Lang…☆128Updated this week
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"☆141Updated 4 months ago
- ☆614Updated 2 months ago
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.☆114Updated last year
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models".☆55Updated 10 months ago
- Code used to run the platform for the LLM CTF colocated with SaTML 2024☆26Updated last year
- ☆104Updated 4 months ago
- Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques" publis…☆75Updated last year