isamu-isozaki / AI-Pentest-Benchmark
The goal of this repo is to become a benchmark for pentesting
☆16 · Updated last year
Alternatives and similar repositories for AI-Pentest-Benchmark
Users who are interested in AI-Pentest-Benchmark are comparing it to the repositories listed below
- ☆165 · Updated 4 months ago
- ☆123 · Updated last year
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on… ☆83 · Updated 3 weeks ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆328 · Updated 2 weeks ago
- SecLLMHolmes is a generalized, fully automated, and scalable framework to systematically evaluate the performance (i.e., accuracy and rea… ☆60 · Updated 5 months ago
- The D-CIPHER and NYU CTF baseline LLM Agents built for NYU CTF Bench ☆99 · Updated this week
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" ☆15 · Updated 7 months ago
- Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques" publis… ☆80 · Updated last year
- An autonomous LLM-agent for large-scale, repository-level code auditing ☆250 · Updated last week
- CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities ☆106 · Updated last week
- The automated prompt injection framework for LLM-integrated applications. ☆235 · Updated last year
- ☆94 · Updated last month
- Repository for PrimeVul Vulnerability Detection Dataset ☆188 · Updated last year
- An Execution Isolation Architecture for LLM-Based Agentic Systems ☆97 · Updated 9 months ago
- ☆48 · Updated last year
- This repo contains the codes of the penetration test benchmark for Generative Agents presented in the paper "AutoPenBench: Benchmarking G… ☆45 · Updated 2 weeks ago
- ☆52 · Updated last year
- The official repository of "GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics". The paper will appear in the IE… ☆47 · Updated 2 years ago
- CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software ☆294 · Updated last year
- [USENIX Security '24] An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities agai… ☆52 · Updated 7 months ago
- DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection (RAID 2023) https://surrealyz.github.io/… ☆162 · Updated last year
- Official repo for FSE'24 paper "CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking" ☆16 · Updated 7 months ago
- A comprehensive local Linux Privilege-Escalation Benchmark ☆41 · Updated last month
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a… ☆69 · Updated last month
- ☆38 · Updated 4 months ago
- EvoEval: Evolving Coding Benchmarks via LLM ☆79 · Updated last year
- A Novel Benchmark evaluating the Deep Capability of Vulnerability Detection with Large Language Models ☆29 · Updated 6 months ago
- TensorFlow API analysis tool and malicious model detection tool ☆36 · Updated 5 months ago
- Finetuning large language models (LLMs) for vulnerability detection ☆52 · Updated 6 months ago
- ☆69 · Updated last week