An alignment auditing agent capable of quickly exploring alignment hypotheses
★1,233 · Jun 13, 2026 · Updated this week
Alternatives and similar repositories for inspect_petri
Users that are interested in inspect_petri are comparing it to the libraries listed below.
- bloom - evaluate any behavior immediately 🌸🌱 ★1,353 · May 7, 2026 · Updated last month
- ★47 · Jul 4, 2025 · Updated 11 months ago
- ★623 · Jun 19, 2025 · Updated 11 months ago
- James' cookbook of evaluations and finetuning experiments ★28 · Feb 19, 2026 · Updated 3 months ago
- Inspect: A framework for large language model evaluations ★2,188 · Updated this week
- Open-sourced evaluation suite from the Monitoring Monitorability paper ★79 · Apr 22, 2026 · Updated last month
- Prompts used in the Automated Auditing Blog Post ★157 · Jul 24, 2025 · Updated 10 months ago
- A python sdk for LLM finetuning and inference on runpod infrastructure ★30 · May 12, 2026 · Updated last month
- Repository for "Training Language Models To Explain Their Own Computations" ★22 · Dec 22, 2025 · Updated 5 months ago
- Open Source Replication of Anthropic's Alignment Faking Paper ★58 · Apr 4, 2025 · Updated last year
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ★138 · May 18, 2026 · Updated 3 weeks ago
- Collection of evals for Inspect AI ★529 · Updated this week
- ★26 · Jun 22, 2025 · Updated 11 months ago
- ★25 · May 25, 2024 · Updated 2 years ago
- Public repository containing METR's DVC pipeline for eval data analysis ★291 · Mar 6, 2026 · Updated 3 months ago
- ★28 · Nov 11, 2025 · Updated 7 months ago
- A library for training crosscoders ★17 · May 28, 2025 · Updated last year
- ★11 · Jun 2, 2021 · Updated 5 years ago
- Official Code for our paper: "Language Models Learn to Mislead Humans via RLHF" ★20 · Oct 11, 2024 · Updated last year
- ★54 · May 9, 2025 · Updated last year
- A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively. ★74 · Apr 15, 2026 · Updated 2 months ago
- ★42 · Jul 6, 2025 · Updated 11 months ago
- This was designed for interp researchers who want to do research on or with interp agents to give quality of life improvements and fix … ★147 · Feb 8, 2026 · Updated 4 months ago
- Python Server for C3 AI app. A project that brings the power of Large Language Models (LLM) and Retrieval-Augmented Generation (RAG) with… ★24 · Jan 7, 2024 · Updated 2 years ago
- ★154 · Sep 29, 2025 · Updated 8 months ago
- A Chrome extension that allows you to export your Claude.ai conversations in various formats (JSON, Markdown, Plain Text) with support fo… ★83 · Apr 29, 2026 · Updated last month
- A fuzzy file picker in a tmux popup for selecting files with terminal-based AI coding assistants ★48 · Apr 26, 2026 · Updated last month
- ★49 · May 17, 2026 · Updated 3 weeks ago
- Tools for optimizing steering vectors in LLMs. ★22 · Apr 10, 2025 · Updated last year
- Residual Quantization Autoencoder, used for interpreting LLMs ★14 · Jan 1, 2025 · Updated last year
- Independent robustness evaluation of Improving Alignment and Robustness with Short Circuiting ★17 · Apr 15, 2025 · Updated last year
- ★17 · Jul 9, 2025 · Updated 11 months ago
- ★14 · Jul 12, 2024 · Updated last year
- ★26 · Sep 3, 2025 · Updated 9 months ago
- A Python-based security assessment tool for continuous automated security scanning and monitoring of domains. ★13 · Apr 4, 2025 · Updated last year
- llama4_trip_planning_agent ★13 · Apr 5, 2025 · Updated last year
- ★1,131 · Updated this week
- ★24 · Updated this week
- Sparsify transformers with SAEs and transcoders ★725 · Jun 8, 2026 · Updated last week