[NeurIPS'24, Spotlight] CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
☆86 May 7, 2026 Updated last month
Alternatives and similar repositories for cti-bench
Users that are interested in cti-bench are comparing it to the libraries listed below.
- AnnoCTR corpus for detection and linking of entities in cyber threat reports ☆30 Apr 12, 2024 Updated 2 years ago
- CyberMetric dataset ☆124 May 27, 2026 Updated 3 weeks ago
- A Novel and Modular Solution for Extracting All STIX Objects in CTI Reports ☆33 Aug 21, 2023 Updated 2 years ago
- ☆10 Jan 21, 2019 Updated 7 years ago
- CyberBench: A Multi-Task Cyber LLM Benchmark ☆34 Apr 29, 2025 Updated last year
- ☆42 Feb 18, 2026 Updated 4 months ago
- MALOnt - an ontology for Malware Threat Intelligence. ☆13 Jul 8, 2021 Updated 4 years ago
- RedSage: A Cybersecurity Generalist LLM (ICLR'26) ☆50 May 12, 2026 Updated last month
- OWASP Ontology-driven Threat Modelling framework ☆43 Jul 11, 2023 Updated 2 years ago
- ☆36 Jan 27, 2026 Updated 4 months ago
- Data for CyberSOCEval, an LLM benchmark by Meta & CrowdStrike ☆22 Sep 22, 2025 Updated 8 months ago
- Vulnerability knowledge graph construction ☆30 Dec 24, 2022 Updated 3 years ago
- TTPDrill focuses on developing automated and context-aware analytics of cyber threat intelligence to accurately learn attack patterns (TT… ☆27 May 29, 2020 Updated 6 years ago
- Collection of videos of Raids on Cybercriminals ☆22 Mar 19, 2025 Updated last year
- Extracts IoCs, TTPs and the relationships between them. Outputs a STIX 2.1 bundle. ☆81 May 23, 2026 Updated 3 weeks ago
- ☆118 Apr 3, 2024 Updated 2 years ago
- Intel Retrieval Augmented Generation (RAG) Utilities ☆90 Jan 29, 2024 Updated 2 years ago
- An overview of LLMs for cybersecurity. ☆1,666 Updated this week
- This unique variation on Thinking Claude maps Claude's thought process steps to unicode and forces Claude to think in unicode, potentiall… ☆17 Feb 24, 2025 Updated last year
- Replication package for the paper "Automatic Mapping of Unstructured Cyber Threat Intelligence: An Experimental Study" published at the I… ☆60 Aug 29, 2022 Updated 3 years ago
- SecLLMHolmes is a generalized, fully automated, and scalable framework to systematically evaluate the performance (i.e., accuracy and rea… ☆65 May 4, 2025 Updated last year
- Detection of malicious prompts used to exploit large language models (LLMs) by leveraging supervised machine learning classifiers. ☆21 Oct 30, 2024 Updated last year
- ☆264 Apr 22, 2026 Updated last month
- Research Artifact for HPCA'24 Paper: *Modeling, Derivation, and Automated Analysis of Branch Predictor Security Vulnerabilities*. ☆11 Oct 30, 2025 Updated 7 months ago
- CS-Eval is a comprehensive evaluation suite for fundamental cybersecurity models or large language models' cybersecurity ability. ☆62 Nov 27, 2024 Updated last year
- ⚠️ ARCHIVED: This repository is no longer actively maintained. All Sigma rules are now managed and available in SIEM Rules ☆13 Mar 19, 2026 Updated 2 months ago
- Turn a supported list of filetypes (e.g. .docx) into a markdown structured text file. Also optionally defangs indicators and extract text… ☆12 Jun 1, 2026 Updated 2 weeks ago
- A fun POC that is built to understand AI security agents. ☆36 Oct 30, 2025 Updated 7 months ago
- Agent Zero (agent-zero.ai) extensions for ethical penetration testing ☆21 Sep 10, 2025 Updated 9 months ago
- ☆10 Jul 9, 2020 Updated 5 years ago
- [NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage ☆16 Sep 2, 2025 Updated 9 months ago
- ☆108 Jun 2, 2024 Updated 2 years ago
- [USENIX Security '24] An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities agai… ☆59 Mar 22, 2025 Updated last year
- MCP easy installer is a robust mcp server with tools to search, install, configure, repair and uninstall MCP servers ☆17 Apr 19, 2025 Updated last year
- NHS Hack Day website ☆14 Apr 25, 2026 Updated last month
- Learning from Negative samples for Biomedical Generative Entity Linking ☆18 May 25, 2025 Updated last year
- Get all cve corresponding to a specific keyword or a list of keywords from the mitre database (https://cve.mitre.org/) ☆17 Aug 20, 2022 Updated 3 years ago
- Cyber attack attribution is the process of attempting to trace back a piece of code or malware to a perpetrator of a cyberattack. As cybe… ☆15 Jan 15, 2021 Updated 5 years ago
- Hands-on MCP security lab: 10 real incidents reproduced with vulnerable/secure MCP servers, pytest regressions, and Claude/Cursor battle-… ☆89 Dec 3, 2025 Updated 6 months ago