ZiyueWang25 / llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, showing the models' surprising ability to carry out action-oriented cyberexploits in shell environments.
☆13 · Updated last year
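The core setup is an agent loop: the model proposes a shell command, a harness executes it, and the command's output is fed back to the model as the next observation, repeating until the level's flag is found. Below is a minimal, hypothetical sketch of such a loop, assuming an OpenAI-style chat API and local command execution via `subprocess`; the repo's actual harness, prompts, and models may differ (OverTheWire levels are normally reached over SSH).

```python
import subprocess
from openai import OpenAI  # assumption: an OpenAI-style chat API, not taken from the repo

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are solving a shell-based security challenge. "
    "Reply with exactly one shell command per turn and nothing else."
)

def run_episode(goal: str, max_turns: int = 10) -> None:
    """Alternate between model-proposed commands and their observed output."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": goal},
    ]
    for _ in range(max_turns):
        command = (
            client.chat.completions.create(model="gpt-4o-mini", messages=messages)
            .choices[0].message.content.strip()
        )
        try:
            # Execute the proposed command locally for illustration; the real
            # setup would run it inside an OverTheWire level, e.g. over SSH.
            result = subprocess.run(
                command, shell=True, capture_output=True, text=True, timeout=30
            )
            observation = (result.stdout + result.stderr)[:2000]
        except subprocess.TimeoutExpired:
            observation = "(command timed out)"
        # Feed the command and its output back as the next turn's context.
        messages.append({"role": "assistant", "content": command})
        messages.append({"role": "user", "content": observation or "(no output)"})

if __name__ == "__main__":
    run_episode("Find and print the contents of a file named 'flag' under /tmp.")
```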
Alternatives and similar repositories for llm-security-challenge
Users interested in llm-security-challenge are comparing it to the libraries listed below.
- Tiny package designed to support red teams and penetration testers in exploiting large language model AI solutions. ☆23 · Updated last year
- Whispers in the Machine: Confidentiality in Agentic Systems ☆37 · Updated 2 weeks ago
- LLM security and privacy ☆49 · Updated 7 months ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆51 · Updated 9 months ago
- ☆63 · Updated 11 months ago
- LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins ☆25 · Updated 10 months ago
- Risks and targets for assessing LLMs & LLM vulnerabilities ☆30 · Updated last year
- An Execution Isolation Architecture for LLM-Based Agentic Systems ☆80 · Updated 4 months ago
- AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks ☆47 · Updated 2 weeks ago
- ☆71 · Updated 6 months ago
- This repository provides a benchmark for prompt injection attacks and defenses ☆216 · Updated this week
- ☆109 · Updated 2 weeks ago
- Code to break Llama Guard ☆31 · Updated last year
- Agent Security Bench (ASB) ☆81 · Updated last month
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks. ☆69 · Updated last year
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts. ☆158 · Updated 2 months ago
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a… ☆56 · Updated 2 months ago
- ☆43 · Updated 8 months ago
- Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques" publis… ☆71 · Updated last year
- The D-CIPHER and NYU CTF baseline LLM Agents built for NYU CTF Bench ☆77 · Updated last month
- Fine-tuning base models to build robust task-specific models ☆30 · Updated last year
- Code to conduct an embedding attack on LLMs ☆25 · Updated 4 months ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" ☆126 · Updated last month
- Code used to run the platform for the LLM CTF colocated with SaTML 2024 ☆26 · Updated last year
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024. ☆113 · Updated 11 months ago
- ☆114 · Updated 10 months ago
- ☆29 · Updated 9 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆105 · Updated last year
- Implementation of BEAST adversarial attack for language models (ICML 2024) ☆87 · Updated last year
- ☆26 · Updated 9 months ago