ZiyueWang25 / llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, showing the models' surprising ability to perform action-oriented cyberexploits in shell environments.
☆13 · Updated last year
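The core idea is a simple agent loop: an LLM proposes shell commands, the harness executes them over SSH against an OverTheWire level, and the output is fed back into the model's context. A minimal sketch of that loop, assuming `paramiko` for SSH and a scripted stand-in where the real project would query an LLM (none of these names come from the repo itself):

```python
# Hypothetical sketch, not the repo's actual code: drive OverTheWire's
# Bandit level 0 with proposed commands. Requires `pip install paramiko`.
import paramiko

def propose_command(transcript: str) -> str:
    """Stand-in for the LLM call. Here it replays a scripted solution to
    Bandit level 0 (the level-1 password sits in a file named `readme`)."""
    scripted = ["ls", "cat readme"]
    return scripted[transcript.count("$ ")]

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
# Bandit level 0 credentials are public: bandit0/bandit0 on port 2220.
ssh.connect("bandit.labs.overthewire.org", port=2220,
            username="bandit0", password="bandit0")

transcript = ""
for _ in range(2):  # cap agent steps so a confused model cannot loop forever
    cmd = propose_command(transcript)
    # exec_command runs each command in a fresh session, so shell state such
    # as `cd` does not persist between steps in this simplified loop.
    _, stdout, stderr = ssh.exec_command(cmd, timeout=30)
    output = stdout.read().decode() + stderr.read().decode()
    transcript += f"$ {cmd}\n{output}"

ssh.close()
print(transcript)
```

Swapping `propose_command` for a real LLM call turns this into the action-oriented evaluation the repo describes: the model sees the accumulated transcript and must decide the next command on its own.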
Alternatives and similar repositories for llm-security-challenge
Users interested in llm-security-challenge are comparing it to the libraries listed below.
- Whispers in the Machine: Confidentiality in Agentic Systems ☆39 · Updated 2 months ago
- LLM security and privacy ☆49 · Updated 9 months ago
- Risks and targets for assessing LLMs & LLM vulnerabilities ☆32 · Updated last year
- LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins ☆27 · Updated last year
- Security Attacks on LLM-based Code Completion Tools (AAAI 2025) ☆20 · Updated 3 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ☆326 · Updated 6 months ago
- This repository provides a benchmark for prompt injection attacks and defenses ☆255 · Updated 3 weeks ago
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on… ☆49 · Updated last week
- ☆70 · Updated last year
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a… ☆401 · Updated last year
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts. ☆165 · Updated 4 months ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆230 · Updated last week
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks ☆51 · Updated 2 months ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" ☆137 · Updated 4 months ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆54 · Updated 11 months ago
- ☆82 · Updated 8 months ago
- TAP: An automated jailbreaking method for black-box LLMs ☆182 · Updated 8 months ago
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ☆513 · Updated 10 months ago
- Official implementation of the paper "DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers" ☆56 · Updated 11 months ago
- Code to break Llama Guard ☆31 · Updated last year
- Code to conduct an embedding attack on LLMs ☆27 · Updated 7 months ago
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a… ☆62 · Updated 5 months ago
- ☆591 · Updated last month
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024. ☆114 · Updated last year
- Implementation for "RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content" ☆23 · Updated last year
- ☆60 · Updated 7 months ago
- An Execution Isolation Architecture for LLM-Based Agentic Systems ☆86 · Updated 6 months ago
- Agent Security Bench (ASB) ☆102 · Updated last month
- Automated Safety Testing of Large Language Models ☆16 · Updated 6 months ago
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models". ☆52 · Updated 9 months ago