tuhh-softsec / LLMSecEval
☆55 Updated last year
Alternatives and similar repositories for LLMSecEval
Users who are interested in LLMSecEval compare it to the repositories listed below.
- Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques" publis… ☆83 Updated 2 years ago
- ☆127 Updated last year
- SecLLMHolmes is a generalized, fully automated, and scalable framework to systematically evaluate the performance (i.e., accuracy and rea… ☆63 Updated 9 months ago
- 🪐 A Database of Existing Security Vulnerabilities Patches to Enable Evaluation of Techniques (single-commit; multi-language) ☆42 Updated 9 months ago
- [USENIX Security '24] An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities agai… ☆56 Updated 10 months ago
- Agent Security Bench (ASB) ☆177 Updated 3 months ago
- CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities ☆142 Updated 3 weeks ago
- A curated list of awesome resources about LLM supply chain security (including papers, security reports and CVEs) ☆94 Updated last year
- DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection (RAID 2023) https://surrealyz.github.io/… ☆172 Updated last year
- CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software ☆312 Updated last year
- Repository for PrimeVul Vulnerability Detection Dataset ☆216 Updated last year
- 🔥🔥🔥 Detecting hidden backdoors in Large Language Models with only black-box access ☆52 Updated 8 months ago
- ☆29 Updated last year
- Automated Benchmarking of LLM Agents on Real-World Software Security Tasks [NeurIPS 2025] ☆55 Updated last week
- ☆50 Updated last year
- A Novel Benchmark evaluating the Deep Capability of Vulnerability Detection with Large Language Models ☆32 Updated 9 months ago
- AIBugHunter: A Practical Tool for Predicting, Classifying and Repairing Software Vulnerabilities ☆43 Updated last year
- An implementation of the ACL 2024 Findings paper "Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tu… ☆75 Updated 3 months ago
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆64 Updated 2 months ago
- The automated prompt injection framework for LLM-integrated applications. ☆251 Updated last year
- ☆78 Updated last year
- ☠️ Ground-truth dataset for vulnerability prediction (known research datasets and data sources included such as NVD, CVE Details and OSV)… ☆103 Updated 2 years ago
- VulRepair: A T5-Based Automated Software Vulnerability Repair ☆84 Updated 8 months ago
- ☆43 Updated last year
- [USENIX Security'24] Official repository of "Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise a… ☆113 Updated last year
- An autonomous LLM-agent for large-scale, repository-level code auditing ☆322 Updated 2 months ago
- ☆31 Updated last year
- Vul4J: A Dataset of Reproducible Java Vulnerabilities ☆119 Updated 5 months ago
- CS-Eval is a comprehensive evaluation suite for fundamental cybersecurity models or large language models' cybersecurity ability. ☆58 Updated last year
- For our ISSTA23 paper "How Effective are Neural Networks for Fixing Security Vulnerabilities?" by Yi Wu, Nan Jiang, Hung Viet Pham, Thiba… ☆41 Updated 2 years ago