google-deepmind/dangerous-capability-evaluations

google-deepmind / dangerous-capability-evaluations

☆73

Alternatives and similar repositories for dangerous-capability-evaluations

Users who are interested in dangerous-capability-evaluations are comparing it to the repositories listed below.

redwoodresearch / interp
Redwood Research's transformer interpretability tools
☆15 · Apr 15, 2022 · Updated 4 years ago
METR / public-tasks
☆129 · Jun 10, 2026 · Updated last month
rgreenblatt / model_organism_public
☆15 · Jun 17, 2025 · Updated last year
UKGovernmentBEIS / control-arena
ControlArena is a collection of settings, model organisms and protocols for running control experiments.
☆213 · Updated this week
moohax / malwareGPT
☆18 · May 6, 2023 · Updated 3 years ago
kyegomez / dev-swarm
A swarm of LLM agents that will help you test, document, and productionize your code!
☆20 · Updated this week
meridianlabs-ai / inspect_viz
Data visualization for Inspect AI large language model evaluations.
☆21 · Jul 15, 2026 · Updated last week
METR / vivaria
Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
☆140 · May 18, 2026 · Updated 2 months ago
aisa-group / decomposing-eval-awareness
Decomposing and measuring evaluation awareness in existing benchmarks and our proposed EvalAwareBench.
☆19 · Jun 1, 2026 · Updated last month
neelnanda-io / Neuroscope
Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons
☆14 · Feb 13, 2023 · Updated 3 years ago
rgreenblatt / control-evaluations
☆25 · May 25, 2024 · Updated 2 years ago
HazyResearch / dd-genomics
The Genomics DeepDive project
☆11 · Jun 20, 2016 · Updated 10 years ago
timothee-chauvin / eyeballvul
future-proof vulnerability detection benchmark, based on CVEs in open-source repos
☆70 · Updated this week
EleutherAI / deep-ignorance
☆20 · Jan 7, 2026 · Updated 6 months ago
EleutherAI / elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
☆221 · Updated this week
METR / task-standard
METR Task Standard
☆184 · Feb 3, 2025 · Updated last year
jvmncs / safe-grid-agents
Training (hopefully) safe agents in gridworlds
☆26 · May 12, 2019 · Updated 7 years ago
UlisseMini / procgen-tools
Tools for running experiments on RL agents in procgen environments
☆20 · Apr 5, 2024 · Updated 2 years ago
redwoodresearch / Text-Steganography-Benchmark
Code for Preventing Language Models From Hiding Their Reasoning, which evaluates defenses against LLM steganography.
☆25 · Jan 26, 2024 · Updated 2 years ago
ai4er-cdt / geograph
GeoGraph provides a tool for analysing habitat fragmentation and related problems in landscape ecology. GeoGraph builds a geospatially re…
☆41 · Apr 12, 2024 · Updated 2 years ago
callummcdougall / ARENA_2.0
Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL.
☆247 · Aug 11, 2025 · Updated 11 months ago
scaleapi / propensity-evaluation
Open-source code for propensity evaluation
☆19 · Apr 25, 2026 · Updated 3 months ago
liaoq / pnas2019
☆11 · Nov 27, 2019 · Updated 6 years ago
GiantSeaweed / DECREE
Official repository for CVPR'23 paper: Detecting Backdoors in Pre-trained Encoders
☆39 · Sep 25, 2023 · Updated 2 years ago
n1kn4x / timing-analysis
Python3 library for sophisticated timing attacks using Gaussian Mixture Model.
☆22 · Apr 10, 2022 · Updated 4 years ago
WuTheFWasThat / EigenSeeClearlyNow
Linear algebra visualizations
☆13 · Jul 10, 2021 · Updated 5 years ago
niplav / iqisa
Collection of compatible forecasting datasets
☆12 · Feb 28, 2024 · Updated 2 years ago
JacobPfau / fillerTokens
☆76 · Apr 27, 2024 · Updated 2 years ago
sfeucht / footprints
https://footprints.baulab.info
☆17 · Oct 4, 2024 · Updated last year
alexander-turner / attainable-utility-preservation
☆11 · Jun 2, 2021 · Updated 5 years ago
HumanCompatibleAI / deep-rlsp
Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021.
☆27 · May 4, 2021 · Updated 5 years ago
BishopFox / burpcage
☆10 · May 25, 2023 · Updated 3 years ago
google-deepmind / cartesian-frames
A formalisation of Cartesian Frames, a perspective on embedded agency, in the HOL theorem prover.
☆22 · Dec 20, 2021 · Updated 4 years ago
RLG-Leiden / edugym
☆15 · Sep 22, 2023 · Updated 2 years ago
gt-big-data / solar-forecasting
An application that displays a map and graphs showing solar irradiance forecasts in solar farms in Georgia using data from the National S…
☆10 · Oct 15, 2021 · Updated 4 years ago
safety-research / inoculation-prompting
☆15 · Oct 13, 2025 · Updated 9 months ago
METR / Measuring-Early-2025-AI-on-Exp-OSS-Devs
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity: https://metr.org/blog/2025-07-10-early-2025-ai-e…
☆16 · Feb 23, 2026 · Updated 5 months ago
epoch-research / MirrorCode
Public repository for MirrorCode
☆37 · Jul 18, 2026 · Updated last week
felixbinder / introspection_self_prediction
Code for experiments on self-prediction as a way to measure introspection in LLMs
☆16 · Dec 10, 2024 · Updated last year