NYU-LLM-CTF / NYU_CTF_Bench
☆59 Updated 2 months ago
Alternatives and similar repositories for NYU_CTF_Bench
Users that are interested in NYU_CTF_Bench are comparing it to the libraries listed below
- The D-CIPHER and NYU CTF baseline LLM Agents built for NYU CTF Bench ☆86 Updated this week
- CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities ☆62 Updated 3 weeks ago
- The repository of VulnBot: Autonomous Penetration Testing for A Multi-Agent Collaborative Framework. ☆80 Updated 3 months ago
- A comprehensive local Linux Privilege-Escalation Benchmark ☆37 Updated last month
- The automated prompt injection framework for LLM-integrated applications. ☆216 Updated 10 months ago
- 🧠 LLMFuzzer - Fuzzing Framework for Large Language Models 🧠 LLMFuzzer is the first open-source fuzzing framework specifically designed … ☆286 Updated last year
- A curated list of awesome resources about LLM supply chain security (including papers, security reports and CVEs) ☆78 Updated 5 months ago
- [CCS'24] An LLM-based, fully automated fuzzing tool for option combination testing. ☆84 Updated 2 months ago
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on… ☆44 Updated 3 weeks ago
- This is a dataset intended to train an LLM for completely CVE-focused input and output. ☆62 Updated 3 weeks ago
- ☆65 Updated 5 months ago
- PenGym: Pentesting Training Framework for Reinforcement Learning Agents ☆35 Updated 6 months ago
- CTF challenges designed and implemented in machine learning applications ☆158 Updated 10 months ago
- This repo contains the codes of the penetration test benchmark for Generative Agents presented in the paper "AutoPenBench: Benchmarking G… ☆35 Updated this week
- An Execution Isolation Architecture for LLM-Based Agentic Systems ☆83 Updated 5 months ago
- A curated list of research resources in automated vulnerability detection (AVD) ☆33 Updated 7 months ago
- SecLLMHolmes is a generalized, fully automated, and scalable framework to systematically evaluate the performance (i.e., accuracy and rea… ☆58 Updated 2 months ago
- An autonomous LLM-agent for large-scale, repository-level code auditing ☆159 Updated this week
- [USENIX Security '24] An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities agai… ☆47 Updated 3 months ago
- CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software ☆263 Updated 11 months ago
- ☆121 Updated last month
- Large Language Model guided Protocol Fuzzing (NDSS'24) ☆346 Updated 2 weeks ago
- MegaVul - The largest, high-quality, extensible, continuously updated, C/C++/Java vulnerability dataset ☆107 Updated 6 months ago
- VulZoo: A Comprehensive Vulnerability Intelligence Dataset (ASE 2024 Demo) ☆54 Updated 3 months ago
- ☆67 Updated last year
- This repository provides a benchmark for prompt injection attacks and defenses ☆245 Updated last month
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ☆504 Updated 9 months ago
- ☆45 Updated 9 months ago
- CS-Eval is a comprehensive evaluation suite for fundamental cybersecurity models or large language models' cybersecurity ability. ☆43 Updated 7 months ago
- ☆119 Updated last year