safety-research / petri
An alignment auditing agent capable of quickly exploring alignment hypotheses
★804 Updated last week
Alternatives and similar repositories for petri
Users who are interested in petri are comparing it to the libraries listed below.
- bloom - evaluate any behavior immediately 🌸🌱 ★1,107 Updated this week
- ★550 Updated 7 months ago
- Evolve your language agent with Agentic Context Engineering (ACE) ★507 Updated 2 months ago
- Prompts used in the Automated Auditing Blog Post ★134 Updated 5 months ago
- ★916 Updated last week
- Together Open Deep Research ★354 Updated 9 months ago
- Provider-agnostic, open-source evaluation infrastructure for language models ★709 Updated 3 weeks ago
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ★512 Updated last month
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ★271 Updated 3 months ago
- 🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data. ★634 Updated this week
- Real-Time Detection of Hallucinated Entities in Long-Form Generation ★274 Updated 2 months ago
- ★259 Updated last month
- Super basic implementation (gist-like) of RLMs with REPL environments. ★435 Updated last week
- General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes. ★1,230 Updated this week
- ★236 Updated last month
- Inference-time scaling for LLMs-as-a-judge. ★325 Updated 2 months ago
- A Text-Based Environment for Interactive Debugging ★289 Updated this week
- Verbalized Sampling, a training-free prompting strategy to mitigate mode collapse in LLMs by requesting responses with probabilities. Ach… ★661 Updated 2 weeks ago
- Open-source versioning, tracing, and annotation tooling. ★211 Updated 2 months ago
- Context Engineering Course with DSPy ★211 Updated 5 months ago
- This repository allows reproduction of Poetiq's record-breaking submission to the ARC-AGI-1 and ARC-AGI-2 benchmarks. ★1,140 Updated last month
- The lightweight framework for building agents ★258 Updated this week
- Agent File (.af): An open file format for serializing stateful AI agents with persistent memory and behavior. Share, checkpoint, and vers… ★983 Updated last month
- A framework for optimizing DSPy programs with RL ★304 Updated last week
- This repository contains the toolkit for replicating results from our technical report. ★192 Updated 4 months ago
- Verifiers for LLM Reinforcement Learning ★80 Updated 4 months ago
- RAG evaluation without the need for "golden answers" ★333 Updated last month
- An open-source tool for LLM prompt optimization. ★746 Updated last week
- An agent benchmark with tasks in a simulated software company. ★622 Updated 2 months ago
- CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, … ★637 Updated last week