NYU-LLM-CTF/NYU_CTF_Bench

☆123

Alternatives and similar repositories for NYU_CTF_Bench

Users interested in NYU_CTF_Bench are comparing it to the repositories listed below.

NYU-LLM-CTF / nyuctf_agents
The D-CIPHER and NYU CTF baseline LLM Agents built for NYU CTF Bench
☆131 · Oct 25, 2025 · Updated 4 months ago
lucagioacchini / auto-pen-bench
This repo contains the codes of the penetration test benchmark for Generative Agents presented in the paper "AutoPenBench: Benchmarking G…
☆65 · Oct 28, 2025 · Updated 4 months ago
ipa-lab / benchmark-privesc-linux
A comprehensive local Linux Privilege-Escalation Benchmark
☆46 · Nov 7, 2025 · Updated 4 months ago
KHenryAegis / Pentest-R1
The repository of Pentest-R1: Towards Autonomous Penetration Testing Reasoning Optimized via Two-Stage Reinforcement Learning.
☆29 · Sep 8, 2025 · Updated 5 months ago
Dizzy-K / AutoPT
[IEEE T-IFS] AutoPT: How Far Are We from the Fully Automated Web Penetration Testing?
☆32 · Aug 18, 2025 · Updated 6 months ago
Lucas-TY / llm_Implicit_reference
Official Implementation of implicit reference attack
☆11 · Oct 16, 2024 · Updated last year
Err0rCM / CTFd_with_CTFd-whale
This repository provides a reference for CTF dynamic target machines
☆14 · Mar 11, 2023 · Updated 2 years ago
isamu-isozaki / AI-Pentest-Benchmark
The goal of this repo is to become a benchmark for pentesting
☆19 · Oct 25, 2024 · Updated last year
andyzorigin / cybench
☆203 · Dec 13, 2025 · Updated 2 months ago
uiuc-kang-lab / cve-bench
CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities
☆167 · Jan 14, 2026 · Updated last month
imethanlee / KnowPhish
[USENIX Security 2024] Official Repository of 'KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-…
☆14 · Aug 6, 2025 · Updated 7 months ago
o-o-overflow / scoring-playground
Tool to test different CTF scoring algorithms on real data
☆17 · May 3, 2021 · Updated 4 years ago
dodo47 / cyberML
Machine learning on knowledge graphs for context-aware security monitoring (data and model)
☆18 · Mar 11, 2022 · Updated 3 years ago
TPs-ESIR-S9 / PcapFileAnalysis
Malicious Network Traffic Analysis with AI
☆22 · Feb 1, 2024 · Updated 2 years ago
youngsecurity / pentest-agent-system
The Pentest Agent System is an autonomous penetration testing framework built on the MITRE ATT&CK framework.
☆30 · Apr 16, 2025 · Updated 10 months ago
brootware / CTF-Writeups
Write-ups for CTF problems online.
☆15 · Mar 17, 2022 · Updated 3 years ago
NYU-LLM-CTF / nyuctf_agents_craken
☆31 · Jul 13, 2025 · Updated 7 months ago
ICL-ml4csec / SQIRL
☆19 · Jun 27, 2023 · Updated 2 years ago
cnut1648 / Model-Fingerprint
Fingerprint large language models
☆49 · Jul 11, 2024 · Updated last year
seclab-fudan / FirmRec
Firmrec is a recurring vulnerability detector for embedded firmware.
☆51 · May 9, 2025 · Updated 9 months ago
n132 / CTF-Challenges
CTF-PWN LEARNING MATERIALS
☆22 · Jun 25, 2024 · Updated last year
LIONS-EPFL / Charmer
Revisiting Character-level Adversarial Attacks for Language Models, ICML 2024
☆19 · Feb 12, 2025 · Updated last year
jpmorganchase / CyberBench
CyberBench: A Multi-Task Cyber LLM Benchmark
☆30 · Apr 29, 2025 · Updated 10 months ago
sefcom / ropbot
A fast and powerful gadget finder and ROP chain generator. A research prototype for the ropbot paper accepted at NDSS'26.
☆45 · Jan 22, 2026 · Updated last month
keraJLi / synthetic-gymnax
Drop-in environment replacements that make your RL algorithm train faster.
☆21 · Jun 19, 2024 · Updated last year
aielte-research / HackSynth
LLM Agent and Evaluation Framework for Autonomous Penetration Testing
☆293 · Jun 24, 2025 · Updated 8 months ago
0xb0bb / ctf-challs
A subset of CTF challenges I have made over the years.
☆18 · Aug 4, 2022 · Updated 3 years ago
OneOffTech / awesome-pdf
A curated list of amazing libraries, services and resources to work with PDF files
☆16 · Jan 28, 2026 · Updated last month
uiuc-kang-lab / InjecAgent
☆118 · Jul 2, 2024 · Updated last year
wearetyomsmnv / Awesome-LLM-agent-Security
All about llm-agents security, attack, vulnerabilities and how to do them for cybersecurity.
☆44 · Dec 28, 2025 · Updated 2 months ago
AI45Lab / ActorAttack
☆122 · Feb 3, 2025 · Updated last year
ybai62868 / OpenCL_Xilinx-Intel_HeteroCL
This is a repo which contains some details about how to use OpenCL backend (Xilinx/Intel).
☆25 · Oct 18, 2019 · Updated 6 years ago
seclab-fudan / RecurScan
☆27 · Feb 19, 2024 · Updated 2 years ago
dstl / YAWNING-TITAN
YAWNING TITAN is an abstract, graph based cyber-security simulation environment that supports the training of intelligent agents for auto…
☆66 · May 21, 2024 · Updated last year
DLVulDet / PrimeVul
Repository for PrimeVul Vulnerability Detection Dataset
☆224 · Sep 7, 2024 · Updated last year
andreashappe / cochise
Autonomous Assumed Breach Penetration-Testing Active Directory Networks
☆41 · Updated this week
Rookie143 / BadRobot
This is the official repository for the ICLR 2025 accepted paper Badrobot: Manipulating Embodied LLMs in the Physical World.
☆41 · Jun 26, 2025 · Updated 8 months ago
fuzz-evaluator / guidelines
☆70 · Mar 7, 2024 · Updated 2 years ago
olegnazarov / llm-fortress
Enterprise AI Security Platform - Real-time firewall protection for LLM applications against prompt injection, data leakage, and function…
☆23 · Sep 14, 2025 · Updated 5 months ago