isamu-isozaki / AI-Pentest-Benchmark

The goal of this repo is to become a benchmark for pentesting

☆13

Alternatives and similar repositories for AI-Pentest-Benchmark

Users who are interested in AI-Pentest-Benchmark compare it to the repositories listed below.

sunblaze-ucb / cybergym
CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on…
☆49 Updated last week
andyzorigin / cybench
☆130 Updated last month
NYU-LLM-CTF / nyuctf_agents
The D-CIPHER and NYU CTF baseline LLM Agents built for NYU CTF Bench
☆91 Updated last week
eth-sri / sven
☆120 Updated last year
PurCL / RepoAudit
An autonomous LLM-agent for large-scale, repository-level code auditing
☆192 Updated 3 weeks ago
SunLab-GMU / GraphSPD
The official repository of "GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics". The paper will appear in the IE…
☆47 Updated 2 years ago
ai4cloudops / SecLLMHolmes
SecLLMHolmes is a generalized, fully automated, and scalable framework to systematically evaluate the performance (i.e., accuracy and rea…
☆57 Updated 3 months ago
lucagioacchini / auto-pen-bench
This repo contains the code for the penetration test benchmark for Generative Agents presented in the paper "AutoPenBench: Benchmarking G…
☆35 Updated last month
ZJU-SEC / TensorAbuse
TensorFlow API analysis tool and malicious model detection tool
☆33 Updated 2 months ago
DLVulDet / PrimeVul
Repository for PrimeVul Vulnerability Detection Dataset
☆168 Updated 11 months ago
NYU-LLM-CTF / NYU_CTF_Bench
☆65 Updated 3 months ago
PurCL / ProSec
Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"
☆15 Updated 4 months ago
llm-platform-security / SecGPT
An Execution Isolation Architecture for LLM-Based Agentic Systems
☆86 Updated 6 months ago
s2e-lab / SecurityEval
Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques" publis…
☆74 Updated last year
ise-uiuc / WhiteFox
WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models (OOPSLA 2024)
☆65 Updated this week
iSEngLab / LLM4VulFix
[2023 TDSC] Pre-trained Model-based Automated Software Vulnerability Repair: How Far are We?
☆25 Updated 2 years ago
secureIT-project / CVEfixes
CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software
☆267 Updated last year
ise-uiuc / KNighter
Automatic checker synthesis for system-level static analysis
☆31 Updated last week
SEC-bench / SEC-bench
☆22 Updated last month
ethz-spylab / agentdojo
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
☆230 Updated this week
for-just-we / CodeAnalyzer
An LLM-based indirect-call analysis tool
☆30 Updated 5 months ago
sola-st / RepairAgent
RepairAgent is an autonomous LLM-based agent for software repair.
☆59 Updated 2 weeks ago
Sweetaroo / VulDetectBench
A Novel Benchmark evaluating the Deep Capability of Vulnerability Detection with Large Language Models
☆26 Updated 3 months ago
NASP-THU / CSEBenchmark
The official repository of the paper "The Digital Cybersecurity Expert: How Far Have We Come?" presented in IEEE S&P 2025
☆20 Updated 2 months ago
chengpeng-wang / LLMDFA
LLMDFA: Analyzing Dataflow in Code with Large Language Models (NeurIPS 2024)
☆140 Updated 2 months ago
Jamrot / ChatGPT-Vulnerability-Management
☆19 Updated 11 months ago
tuhh-softsec / LLMSecEval
☆47 Updated 10 months ago
Vul-LMGNN / vul-LMGGNN
Code for the paper - Source Code Vulnerability Detection: Combining Code Language Models and Code Property Graph
☆78 Updated last year
Hustcw / VulBench
This is a benchmark for evaluating the vulnerability discovery ability of automated approaches including Large Language Models (LLMs), de…
☆69 Updated 8 months ago
ShenaoW / awesome-llm-supply-chain-security
A curated list of awesome resources about LLM supply chain security (including papers, security reports and CVEs)
☆82 Updated 6 months ago