ZiyueWang25 / llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, showing the models' surprising ability to perform action-oriented cyberexploits in shell environments.
☆13 · Updated 2 years ago
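The setup described above implies a simple agent loop: the model reads shell output, proposes the next command, and iterates until it recovers the level's password. Below is a minimal sketch of that kind of loop against an OverTheWire "Bandit" level; this is not the repository's actual harness, `query_llm` is a hypothetical placeholder for whatever chat-completion API is used, and the success check is a crude assumption.

```python
# Sketch of an LLM-driven shell agent loop for an OverTheWire Bandit level.
# NOT the repository's code; query_llm() is a hypothetical stand-in for any
# chat-completion call that maps a transcript to the next shell command.
import paramiko

def query_llm(transcript: str) -> str:
    """Hypothetical helper: given the transcript so far, return the next
    shell command (e.g. via an OpenAI/Anthropic chat API)."""
    raise NotImplementedError

def run_level(user: str, password: str, goal: str, max_steps: int = 10) -> str:
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # Bandit's real SSH endpoint: bandit.labs.overthewire.org, port 2220.
    client.connect("bandit.labs.overthewire.org", port=2220,
                   username=user, password=password)
    transcript = f"Goal: {goal}\n"
    try:
        for _ in range(max_steps):
            command = query_llm(transcript)
            _, stdout, stderr = client.exec_command(command, timeout=30)
            output = stdout.read().decode() + stderr.read().decode()
            transcript += f"$ {command}\n{output}\n"
            if "password" in output.lower():  # crude success heuristic (assumption)
                break
    finally:
        client.close()
    return transcript
```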
Alternatives and similar repositories for llm-security-challenge
Users interested in llm-security-challenge are comparing it to the repositories listed below.
- Whispers in the Machine: Confidentiality in Agentic Systems ☆41 · Updated last month
- LLM security and privacy ☆53 · Updated last year
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆62 · Updated last month
- This project explores training data extraction attacks on the LLaMa 7B, GPT-2XL, and GPT-2-IMDB models to discover memorized content usin… ☆15 · Updated 2 years ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ☆373 · Updated 11 months ago
- This repository provides a benchmark for prompt injection attacks and defenses in LLMs ☆373 · Updated 2 months ago
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks ☆63 · Updated 7 months ago
- Implementation of the BEAST adversarial attack for language models (ICML 2024) ☆92 · Updated last year
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on… ☆101 · Updated 3 months ago
- ☆112 · Updated last month
- Code used to run the platform for the LLM CTF colocated with SaTML 2024 ☆28 · Updated last year
- LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins ☆29 · Updated last year
- Papers about red teaming LLMs and multimodal models ☆159 · Updated 7 months ago
- Official implementation of the paper "DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers" ☆66 · Updated last year
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆56 · Updated last year
- An Execution Isolation Architecture for LLM-Based Agentic Systems ☆101 · Updated 11 months ago
- ☆103 · Updated last year
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts ☆179 · Updated 9 months ago
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models" ☆67 · Updated last year
- [ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability ☆176 · Updated last year
- ☆114 · Updated 8 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs) ☆112 · Updated last year
- [NeurIPS 2024] Official implementation of "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" ☆184 · Updated 9 months ago
- ☆75 · Updated last year
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ☆560 · Updated last year
- TAP: An automated jailbreaking method for black-box LLMs ☆214 · Updated last year
- Security Attacks on LLM-based Code Completion Tools (AAAI 2025) ☆21 · Updated last week
- ☆190 · Updated 2 years ago
- ☆22 · Updated 2 years ago
- Implementation of "RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content" ☆22 · Updated last year