ZiyueWang25 / llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, showing that the models are surprisingly capable of carrying out action-oriented cyberexploits in a shell.
☆12 · Updated last year
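As a rough illustration of the kind of agent loop such a benchmark implies (not code from this repository; the function names, prompt format, and local-execution setup are assumptions), here is a minimal Python sketch in which an LLM proposes shell commands and the host executes them, feeding the output back as context:

```python
# Hypothetical sketch only -- not the repository's code. It illustrates the general
# idea of an LLM-driven shell loop: the model proposes a command, the host runs it,
# and the output is appended to the interaction history for the next step.
import subprocess

def run_command(cmd: str, timeout: int = 10) -> str:
    """Run a shell command locally and return its combined stdout/stderr."""
    result = subprocess.run(cmd, shell=True, capture_output=True,
                            text=True, timeout=timeout)
    return (result.stdout + result.stderr).strip()

def ask_llm(history: list[str]) -> str:
    """Placeholder for a model call (assumed interface): given the interaction
    history, return the next shell command to try."""
    raise NotImplementedError("Plug in your preferred LLM client here.")

def solve_level(goal: str, max_steps: int = 10) -> list[str]:
    """Drive the propose/execute/observe loop for up to max_steps steps."""
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        cmd = ask_llm(history)
        output = run_command(cmd)
        history.append(f"$ {cmd}\n{output}")
    return history
```

A real harness would additionally manage an SSH session to the wargame host, stop once the level password is recovered, and sandbox command execution.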
Alternatives and similar repositories for llm-security-challenge:
Users interested in llm-security-challenge are comparing it to the libraries listed below.
- Whispers in the Machine: Confidentiality in LLM-integrated Systems ☆32 · Updated last week
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆47 · Updated 5 months ago
- Code to break Llama Guard ☆31 · Updated last year
- 🤖🛡️🔍🔒🔑 Tiny package designed to support red teams and penetration testers in exploiting large language model AI solutions. ☆19 · Updated 8 months ago
- Risks and targets for assessing LLMs & LLM vulnerabilities ☆30 · Updated 8 months ago
- LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins ☆25 · Updated 6 months ago
- ☆47 · Updated 6 months ago
- A collection of automated evaluators for assessing jailbreak attempts. ☆102 · Updated this week
- LLM security and privacy ☆44 · Updated 3 months ago
- LLM Self Defense: By Self Examination, LLMs know they are being tricked ☆31 · Updated 8 months ago
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024. ☆110 · Updated 7 months ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" ☆90 · Updated this week
- Adversarial Attacks on GPT-4 via Simple Random Search [Dec 2023] ☆43 · Updated 9 months ago
- Privacy backdoors ☆51 · Updated 9 months ago
- ☆70 · Updated 2 months ago
- ☆19 · Updated last year
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆108 · Updated 10 months ago
- [ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability ☆131 · Updated last month
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆79 · Updated this week
- ☆12 · Updated last month
- ☆161 · Updated last year
- A fast + lightweight implementation of the GCG algorithm in PyTorch ☆164 · Updated 3 weeks ago
- This repository provides an implementation to formalize and benchmark Prompt Injection attacks and defenses ☆167 · Updated last week
- Code used to run the platform for the LLM CTF colocated with SaTML 2024 ☆26 · Updated 10 months ago
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models". ☆39 · Updated 3 months ago
- The official repository of the paper "On the Exploitability of Instruction Tuning". ☆58 · Updated 11 months ago
- ☆45 · Updated last month
- SecGPT: An execution isolation architecture for LLM-based systems ☆60 · Updated last month
- Papers about red teaming LLMs and Multimodal models. ☆91 · Updated 2 months ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆66 · Updated 11 months ago