jpmorganchase / CyberBench
CyberBench: A Multi-Task Cyber LLM Benchmark
☆17Updated 2 months ago
Alternatives and similar repositories for CyberBench
Users who are interested in CyberBench are comparing it to the repositories listed below.
- ☆48Updated 9 months ago
- A collection of agents that use Large Language Models (LLMs) to perform tasks common in our day-to-day jobs in cybersecurity.☆136Updated last year
- https://arxiv.org/abs/2412.02776☆59Updated 7 months ago
- ATLAS tactics, techniques, and case studies data☆76Updated 2 months ago
- Using ML models for red teaming☆43Updated last year
- A productionized greedy coordinate gradient (GCG) attack tool for large language models (LLMs)☆122Updated 6 months ago
- Automated Safety Testing of Large Language Models☆16Updated 5 months ago
- Tree of Attacks (TAP) Jailbreaking Implementation☆111Updated last year
- The D-CIPHER and NYU CTF baseline LLM Agents built for NYU CTF Bench☆86Updated this week
- A dataset intended for training an LLM on entirely CVE-focused input and output.☆62Updated 3 weeks ago
- A collection of prompt injection mitigation techniques.☆23Updated last year
- This repository contains attack chains generated by Aurora that can be reproduced in virtual environments.☆15Updated last week
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on…☆44Updated 3 weeks ago
- ☆53Updated 9 months ago
- ☆65Updated 5 months ago
- YAWNING TITAN is an abstract, graph-based cyber-security simulation environment that supports the training of intelligent agents for auto…☆64Updated last year
- ☆41Updated this week
- [IJCAI 2024] Imperio is an LLM-powered backdoor attack. It allows the adversary to issue language-guided instructions to control the vict…☆42Updated 5 months ago
- Cybersecurity Intelligent Pentesting Helper for Ethical Researcher (CIPHER). Fine-tuned LLM for penetration testing guidance based on wri…☆25Updated 6 months ago
- ☆121Updated last month
- Interactive, dynamic, and realistic LLM honeypots☆52Updated 4 months ago
- CS-Eval is a comprehensive evaluation suite for fundamental cybersecurity models or large language models' cybersecurity ability.☆43Updated 7 months ago
- LLM Honeypot: Leveraging Large Language Models as Advanced Interactive Honeypot Systems☆18Updated 3 months ago
- Code for shelLM tool☆55Updated 5 months ago
- Code snippets to reproduce MCP tool poisoning attacks.☆145Updated 3 months ago
- A comprehensive local Linux Privilege-Escalation Benchmark☆37Updated last month
- Secure Jupyter Notebooks and Experimentation Environment☆76Updated 5 months ago
- SecureBERT is a domain-specific language model to represent cybersecurity textual data.☆95Updated 11 months ago
- General research for Dreadnode☆23Updated last year
- A very simple open source implementation of Google's Project Naptime☆160Updated 3 months ago