lucagioacchini / auto-pen-bench

This repo contains the code for the penetration testing benchmark for generative agents presented in the paper "AutoPenBench: Benchmarking Generative Agents for Penetration Testing". It also contains instructions to install, develop, and test new vulnerable containers to include in the benchmark.

☆35

Alternatives and similar repositories for auto-pen-bench

Users interested in auto-pen-bench are comparing it to the repositories listed below:


llm-platform-security / SecGPT
An Execution Isolation Architecture for LLM-Based Agentic Systems
☆86 Updated 6 months ago
KHenryAegis / VulnBot
The repository of VulnBot: Autonomous Penetration Testing for A Multi-Agent Collaborative Framework.
☆86 Updated 4 months ago
tmylla / HackMentor
The repository of paper "HackMentor: Fine-Tuning Large Language Models for Cybersecurity".
☆126 Updated last year
NYU-LLM-CTF / nyuctf_agents
The D-CIPHER and NYU CTF baseline LLM Agents built for NYU CTF Bench
☆91 Updated last week
Dizzy-K / AutoPT
Benchmark data from the article "AutoPT: How Far Are We from End2End Automated Web Penetration Testing?"
☆17 Updated 9 months ago
NYU-LLM-CTF / NYU_CTF_Bench
☆63 Updated 3 months ago
cybermetric / CyberMetric
CyberMetric dataset
☆93 Updated 7 months ago
sherdencooper / GPTFuzz
Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
☆512 Updated 10 months ago
andyzorigin / cybench
☆130 Updated last month
CS-EVAL / CS-Eval
CS-Eval is a comprehensive evaluation suite for fundamental cybersecurity models or large language models' cybersecurity ability.
☆43 Updated 8 months ago
nbshenxm / pentest-agent
PentestAgent is a novel LLM-driven penetration testing framework to automate intelligence gathering, vulnerability analysis, and exploita…
☆58 Updated last week
liu673 / Awesome-LLM4Security
This project aims to consolidate and share high-quality resources and tools across the cybersecurity domain.
☆229 Updated last week
CSJianYang / SEevenLLM
☆35 Updated last year
johnhalloran321 / mcpSafetyScanner
MCPSafetyScanner - Automated MCP safety auditing and remediation using Agents. More info: https://www.arxiv.org/abs/2504.03767
☆101 Updated 3 months ago
uiuc-kang-lab / cve-bench
CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities
☆73 Updated 2 weeks ago
sherdencooper / PromptFuzz
☆26 Updated 9 months ago
morpheuslord / CVE-llm_dataset
This is a dataset intended to train an LLM on fully CVE-focused inputs and outputs.
☆63 Updated last month
LLMSecurity / HouYi
The automated prompt injection framework for LLM-integrated applications.
☆221 Updated 10 months ago
ZacharyZcR / SecGPT
A Test Project for a Network Security-oriented LLM Tool Emulating AutoGPT
☆288 Updated last year
cyb3rlab / PenGym
PenGym: Pentesting Training Framework for Reinforcement Learning Agents
☆38 Updated 7 months ago
sunblaze-ucb / cybergym
CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on…
☆49 Updated last week
tuhh-softsec / LLMSecEval
☆47 Updated 10 months ago
invariantlabs-ai / invariant
Guardrails for secure and robust agent development
☆327 Updated last week
ai4cloudops / SecLLMHolmes
SecLLMHolmes is a generalized, fully automated, and scalable framework to systematically evaluate the performance (i.e., accuracy and rea…
☆57 Updated 3 months ago
huhusmang / Awesome-LLMs-for-Vulnerability-Detection
Awesome Large Language Models for Vulnerability Detection
☆207 Updated this week
uiuc-kang-lab / InjecAgent
☆70 Updated last year
aielte-research / HackSynth
LLM Agent and Evaluation Framework for Autonomous Penetration Testing
☆197 Updated last month
invariantlabs-ai / mcp-injection-experiments
Code snippets to reproduce MCP tool poisoning attacks.
☆164 Updated 3 months ago
ddzipp / AutoAudit
AutoAudit: the LLM for Cyber Security (Cybersecurity Large Language Model)
☆343 Updated 5 months ago
isamu-isozaki / AI-Pentest-Benchmark
The goal of this repo is to become a benchmark for pentesting
☆13 Updated 9 months ago