ServiceNow/DoomArena

DoomArena is a Framework for Testing AI Agents Against Evolving Security Threats

☆62

Alternatives and similar repositories for DoomArena

Users who are interested in DoomArena are comparing it to the libraries listed below.

The-AI-Alliance / cube-standard
Standardize benchmark wrapping so the community can wrap various otherwise-incompatible benchmarks uniformly and use them everywhere.
☆52 · Jul 17, 2026 · Updated last week
The-AI-Alliance / cube-harness
Drive OSS standards and tools for data curation and evaluation creation for state of the art AI agents
☆54 · Jul 17, 2026 · Updated last week
ServiceNow / agent-poirot
☆17 · May 6, 2025 · Updated last year
ServiceNow / TapeAgents
TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle
☆318 · Dec 16, 2025 · Updated 7 months ago
ServiceNow / WorkArena
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
☆261 · Apr 25, 2026 · Updated 3 months ago
ServiceNow / AgentLab
AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re…
☆606 · Jul 17, 2026 · Updated last week
apple / ml-vlsu
☆14 · Nov 18, 2025 · Updated 8 months ago
The-AI-Alliance / GEO-Bench-2
Code for GEO-Bench V2 datasets. Made in collaboration with TUM, IBM and ServiceNow.
☆30 · Updated this week
fjzzq2002 / WeightWatch
Official Repository of Paper "Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs"
☆15 · Sep 25, 2025 · Updated 10 months ago
kaijiezhu11 / MELON
[ICML'25] MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents
☆37 · Jul 31, 2025 · Updated 11 months ago
invariantlabs-ai / mcp-injection-experiments
Code snippets to reproduce MCP tool poisoning attacks.
☆199 · Apr 10, 2025 · Updated last year
microsoft / llmail-inject-challenge
Code for the API, workload execution, and agents underlying the LLMail-Inject Adaptive Prompt Injection Challenge
☆25 · Apr 9, 2026 · Updated 3 months ago
tml-epfl / os-harm
OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents [NeurIPS 2025 Spotlight]
☆69 · Sep 18, 2025 · Updated 10 months ago
denoland / chromium_buildtools
forked from chromium to use git submodules instead of gclient
☆16 · Dec 18, 2022 · Updated 3 years ago
mitre-atlas / atlas-data
ATLAS tactics, techniques, and case studies data
☆159 · Jun 30, 2026 · Updated 3 weeks ago
Privatris / AgentLeak
AgentLeak: Open benchmark for privacy leakage in LLM agents — 7 channels, multi-agent, multi-framework.
☆25 · Jul 1, 2026 · Updated 3 weeks ago
hoon9405 / DescEmb
DescEmb - Unifying Heterogeneous Electronic Health Records Systems via Text-Based Code Embedding
☆22 · Apr 29, 2025 · Updated last year
PKU-Alignment / ProgressGym
Alignment with a millennium of moral progress. Spotlight@NeurIPS 2024 Track on Datasets and Benchmarks.
☆25 · Mar 30, 2025 · Updated last year
tim-hua-01 / steering-eval-awareness-public
☆17 · Mar 16, 2026 · Updated 4 months ago
dreadnode / AIRTBench-Code
Code Repository for: AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models
☆102 · Apr 26, 2026 · Updated 3 months ago
AI-secure / AdvAgent
☆25 · May 28, 2025 · Updated last year
SkafteNicki / cuda_expm
Matrix exponential in CUDA for PyTorch and TensorFlow
☆17 · Nov 26, 2018 · Updated 7 years ago
Jiuzhouh / Uncertainty-Aware-Language-Agent
This is the official repo for Towards Uncertainty-Aware Language Agent.
☆31 · Aug 15, 2024 · Updated last year
XanderJC / medkit-learn
The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation (NeurIPS 2021) by Alex J. Chan, Ioana Bica, Alihan Huyuk…
☆29 · Jan 5, 2022 · Updated 4 years ago
OrderAndCh4oS / phonetics-transliterator
Convert bodies of text to IPA translations
☆12 · May 2, 2023 · Updated 3 years ago
sergeykhbr / gpu3d
Learn and build GPU RTL from scratch
☆22 · Aug 1, 2025 · Updated 11 months ago
invariantlabs-ai / invariant-gateway
LLM proxy to observe and debug what your AI agents are doing.
☆78 · Nov 6, 2025 · Updated 8 months ago
thu-coai / Agent-SafetyBench
☆149 · Aug 11, 2025 · Updated 11 months ago
OSU-NLP-Group / RedTeamCUA
[ICLR'26 Oral] RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
☆57 · Feb 9, 2026 · Updated 5 months ago
mpomonis / krx
kR^X: Comprehensive Kernel Protection Against Just-In-Time Code Reuse
☆13 · Aug 21, 2017 · Updated 8 years ago
mcubelab / pdproc
Scripts for processing and rendering the MIT push dataset
☆19 · Nov 6, 2019 · Updated 6 years ago
epoch-research / training-cost-trends
☆27 · Apr 1, 2026 · Updated 3 months ago
netsys-edinburgh / SpotLight
☆13 · Nov 18, 2024 · Updated last year
lakeraai / dsec-gandalf
☆24 · Mar 18, 2025 · Updated last year
wspr-ncsu / mininode
Mininode is a CLI tool to reduce the attack surface of Node.js applications by using static analysis.
☆21 · Apr 11, 2023 · Updated 3 years ago
pr0toshi / rateLimit
Limits asset outflows from contracts within customisable timeframes
☆11 · May 7, 2022 · Updated 4 years ago
BerkeleyHCI / HCI-Resources
An informal Wiki for HCI Research Info
☆14 · Jan 15, 2025 · Updated last year
Saluana / EZAI-Web-Scraper
An API that allows you to scrape blog posts and articles and get a list of notes or a summary back.
☆10 · Mar 31, 2023 · Updated 3 years ago
edgeimpulse / example-custom-ml-block-keras
Custom Keras ML block example for Edge Impulse
☆12 · Updated this week