☆65Feb 20, 2026Updated 2 weeks ago
Alternatives and similar repositories for dangerous-capability-evaluations
Users that are interested in dangerous-capability-evaluations are comparing it to the libraries listed below
- A swarm of LLM agents that will help you test, document, and productionize your code!☆16Feb 16, 2026Updated 2 weeks ago
- ☆10May 25, 2023Updated 2 years ago
- An Inspect extension for agentic cyber evaluations☆22Feb 24, 2026Updated last week
- The Genomics DeepDive project☆11Jun 20, 2016Updated 9 years ago
- Explore, Establish, Exploit: Red Teaming Language Models from Scratch☆13Jun 21, 2023Updated 2 years ago
- A utility to inspect, validate, sign and verify machine learning model files.☆66Feb 5, 2025Updated last year
- Advanced SQLMap command builder with an intuitive cheatsheet UI. Works locally in your browser as a single HTML file (no data sent anywhe…☆32Jul 6, 2025Updated 8 months ago
- Redwood Research's transformer interpretability tools☆15Apr 15, 2022Updated 3 years ago
- A Kubernetes sandbox environment for use with inspect_ai☆27Feb 26, 2026Updated last week
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research.☆134Feb 15, 2026Updated 2 weeks ago
- ☆120Jan 19, 2026Updated last month
- ☆25Jan 8, 2025Updated last year
- Small, simple agent task environments for training and evaluation☆19Nov 1, 2024Updated last year
- Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"☆27Jun 4, 2024Updated last year
- ☆26Jul 11, 2022Updated 3 years ago
- Code for Preventing Language Models From Hiding Their Reasoning, which evaluates defenses against LLM steganography.☆25Jan 26, 2024Updated 2 years ago
- Training (hopefully) safe agents in gridworlds☆25May 12, 2019Updated 6 years ago
- Gantry is a CLI that streamlines running experiments in Beaker☆32Updated this week
- ReconPro is a specialized Google dorking tool designed for cybersecurity professionals and bug bounty hunters.☆44Feb 23, 2026Updated last week
- METR Task Standard☆177Feb 3, 2025Updated last year
- Collection of evals for Inspect AI☆393Updated this week
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS …☆60Oct 11, 2024Updated last year
- This tool allows local LLM usage that can automate tasks without human intervention. The agent can call itself recursively and work on…☆20May 5, 2025Updated 10 months ago
- Make it easy to automatically and uniformly measure the behavior of many AI Systems.☆26Oct 2, 2024Updated last year
- Inspect: A framework for large language model evaluations☆1,800Updated this week
- ChatGPT Participates in a Computer Science Exam (2023)☆31Mar 21, 2023Updated 2 years ago
- Miscellaneous resources for Quantum Collective Knowledge☆32Jun 6, 2024Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"☆71Jun 19, 2024Updated last year
- Quantum Gate Language (QGL) is a domain specific language embedded in python for specifying quantum gate sequences.☆33Dec 17, 2025Updated 2 months ago
- Next-Toggle is just a simple plug and use, theme toggle button with multiple light and dark themes.☆11May 9, 2024Updated last year
- code for COLING paper "A Hybrid Model of Classification and Generation for Spatial Relation Extraction"☆10Oct 20, 2022Updated 3 years ago
- ☆10Feb 9, 2026Updated 3 weeks ago
- Notebooks and other course materials for Emory QTM 340 (Fall 2022)☆12Dec 13, 2022Updated 3 years ago
- ☆74Apr 27, 2024Updated last year
- Public repository containing METR's DVC pipeline for eval data analysis☆224Feb 13, 2026Updated 3 weeks ago
- ☆147Jan 17, 2025Updated last year
- Ground Penetrating Radar data processing and classification to detect buried objects in the ground☆14Jul 7, 2023Updated 2 years ago
- A library for probing Stockfish's NNUEs. The code for reading parameters and forward propagation is taken from Stockfish☆12Nov 18, 2025Updated 3 months ago
- Here, I provided the solution for exercises of IBM Quantum Challenge 2020☆10Oct 27, 2020Updated 5 years ago