amazon-science / Cyber-Zero
Cyber-Zero: Training Cybersecurity Agents Without Runtime
☆36 · Updated last week
Alternatives and similar repositories for Cyber-Zero
Users interested in Cyber-Zero are comparing it to the repositories listed below.
- CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on… (☆84, updated 3 weeks ago)
- 🥇 Amazon Nova AI Challenge Winner: ASTRA emerged victorious as the top attacking team in Amazon's global AI safety competition, defeati… (☆62, updated 2 months ago)
- CodeGuard+: Constrained Decoding for Secure Code Generation (☆15, updated last year)
- Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques" publis… (☆80, updated last year)
- [LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization (☆38, updated 7 months ago)
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents (☆52, updated 3 months ago)
- [ICSE'25] Aligning the Objective of LLM-based Program Repair (☆20, updated 7 months ago)
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" (☆15, updated 7 months ago)
- A Benchmark for Evaluating Safety and Trustworthiness in Web Agents for Enterprise Scenarios (☆16, updated 5 months ago)
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks (☆87, updated last year)
- Artifact repository for the paper "Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code", In P… (☆50, updated 6 months ago)
- [COLING'25] CodeJudge Eval: Can Large Language Models be Good Judges in Code Understanding? (☆13, updated 11 months ago)
- EvoEval: Evolving Coding Benchmarks via LLM (☆79, updated last year)
- CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities (☆106, updated last week)
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts (☆172, updated 7 months ago)
- AIxCC: automated vulnerability repair via LLMs, search, and static analysis (☆12, updated last year)
- Implementation of the BEAST adversarial attack for language models (ICML 2024) (☆91, updated last year)
- Official repo for the FSE'24 paper "CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking" (☆16, updated 7 months ago)
- [NeurIPS 2024] Official implementation of "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" (☆162, updated 6 months ago)
- Official code for "An Engorgio Prompt Makes Large Language Model Babble on" (☆15, updated 2 months ago)