anthropic-experimental / automated-auditing
Prompts used in the Automated Auditing Blog Post
☆78 · Updated 2 weeks ago
Alternatives and similar repositories for automated-auditing
Users who are interested in automated-auditing are comparing it to the libraries listed below.
- Public repository containing METR's DVC pipeline for eval data analysis ☆86 · Updated 4 months ago
- ☆377 · Updated last month
- Inference-time scaling for LLMs-as-a-judge. ☆267 · Updated 3 weeks ago
- A framework for optimizing DSPy programs with RL ☆96 · Updated this week
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆433 · Updated last week
- A Text-Based Environment for Interactive Debugging ☆250 · Updated this week
- Open Source Replication of Anthropic's Alignment Faking Paper ☆47 · Updated 4 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆222 · Updated this week
- Testing baseline LLMs performance across various models ☆293 · Updated 2 weeks ago
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models. ☆95 · Updated 2 weeks ago
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆39 · Updated last month
- Routing on Random Forest (RoRF) ☆187 · Updated 10 months ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆103 · Updated last week
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆23 · Updated 4 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆81 · Updated last week
- Train your own SOTA deductive reasoning model ☆103 · Updated 5 months ago
- ☆73 · Updated 5 months ago
- ☆146 · Updated 7 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆108 · Updated 3 months ago
- ☆130 · Updated 4 months ago
- Build hours code to share. ☆427 · Updated last week
- The State Of The Art, intelligence ☆148 · Updated this week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆72 · Updated 4 months ago
- Red-Teaming Language Models with DSPy ☆203 · Updated 5 months ago
- ☆58 · Updated this week
- ☆72 · Updated this week
- ☆182 · Updated 5 months ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ☆289 · Updated last week
- ☆95 · Updated 3 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆53 · Updated 3 months ago