CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on real-world vulnerability analysis tasks.
☆467 · May 18, 2026 · Updated last month
Alternatives and similar repositories for cybergym
Users that are interested in cybergym are comparing it to the libraries listed below.
- ☆99 · Jun 15, 2026 · Updated 2 weeks ago
- ☆10 · May 14, 2024 · Updated 2 years ago
- SecLLMHolmes is a generalized, fully automated, and scalable framework to systematically evaluate the performance (i.e., accuracy and rea… ☆65 · May 4, 2025 · Updated last year
- Implementation and datasets for "Training Language Models to Generate Quality Code with Program Analysis Feedback" ☆41 · Jul 21, 2025 · Updated 11 months ago
- Source code for LLMxCPG paper ☆154 · Mar 26, 2026 · Updated 3 months ago
- Training Language Model Agents to Find Vulnerabilities with CTF-Dojo ☆54 · Jan 10, 2026 · Updated 5 months ago
- Parsing-based Analyzer ☆77 · Jun 8, 2025 · Updated last year
- A Unified Platform for Evaluating SAST Tools for Android ☆20 · Mar 30, 2025 · Updated last year
- Ghidra decompiler in your browser ☆115 · May 4, 2026 · Updated last month
- CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities ☆245 · Jan 14, 2026 · Updated 5 months ago
- ☆26 · Jan 7, 2024 · Updated 2 years ago
- Simultaneous evaluation on both functionality and security of LLM-generated code. ☆41 · Jun 18, 2026 · Updated last week
- Security Harness Engineering for Robust Program Analysis ☆136 · Jan 23, 2026 · Updated 5 months ago
- [ICSE'24 Industry Challenge Track] "ReposVul: A Repository-Level High-Quality Vulnerability Dataset". ☆107 · Nov 24, 2024 · Updated last year
- A manually vetted dataset for security vulnerability detection in Java projects ☆105 · Aug 12, 2025 · Updated 10 months ago
- ☆130 · Jul 14, 2024 · Updated last year
- List of Papers on Attack and Defense (AD) in AI Models ☆27 · Mar 18, 2022 · Updated 4 years ago
- tool of llm-based indirect-call analyzer ☆31 · Feb 18, 2025 · Updated last year
- Cyber-Zero: Training Cybersecurity Agents Without Runtime ☆96 · Feb 13, 2026 · Updated 4 months ago
- 🥇 Amazon Nova AI Challenge Winner - ASTRA emerged victorious as the top attacking team in Amazon's global AI safety competition, defeati… ☆73 · May 11, 2026 · Updated last month
- Resources for our ICSE'24 poster: Prompt-Enhanced Software Vulnerability Detection Using ChatGPT. ☆25 · May 8, 2024 · Updated 2 years ago
- ☆25 · May 28, 2025 · Updated last year
- A Reproducible Benchmark of Recent Java Bugs ☆51 · Aug 19, 2025 · Updated 10 months ago
- ☆624 · Nov 25, 2025 · Updated 7 months ago
- Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper. ☆18 · Nov 21, 2025 · Updated 7 months ago
- An example vulnerable app that integrates an LLM ☆27 · Apr 5, 2024 · Updated 2 years ago
- [CCS'24] An LLM-based, fully automated fuzzing tool for option combination testing. ☆101 · Feb 10, 2026 · Updated 4 months ago
- XNU Image Fuzzer - iOS App for Fuzzing Images with Objective-C Code covering 15 CGCreateBitmap & CGColorSpace Functions working with Raw … ☆41 · Jun 1, 2026 · Updated 3 weeks ago
- [NDSS 2025] "CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models" ☆26 · Aug 20, 2025 · Updated 10 months ago
- MegaVul - The largest, high-quality, extensible, continuously updated, C/C++/Java vulnerability dataset ☆150 · Jan 12, 2025 · Updated last year
- A SAST skill that gives AI coding agents structured vulnerability detection across 34 vulnerability classes. ☆265 · Apr 7, 2026 · Updated 2 months ago
- ☆50 · Jan 14, 2025 · Updated last year
- Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques" publis… ☆92 · Nov 4, 2023 · Updated 2 years ago
- ☆28 · Apr 28, 2023 · Updated 3 years ago
- ☆41 · Jan 13, 2023 · Updated 3 years ago
- ☆29 · Aug 31, 2025 · Updated 10 months ago
- docker env for ios research on a mac host ☆27 · Jun 12, 2025 · Updated last year
- Reinforcement Learning for Repository-Level Code Completion ☆44 · Jun 15, 2026 · Updated 2 weeks ago
- ☆28 · May 27, 2023 · Updated 3 years ago