invariantlabs-ai / invariant
Guardrails for secure and robust agent development
☆243 Updated 3 weeks ago
Alternatives and similar repositories for invariant:
Users who are interested in invariant are comparing it to the libraries listed below.
- Verdict is a library for scaling judge-time compute. ☆202 Updated last week
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆35 Updated this week
- Red-Teaming Language Models with DSPy ☆188 Updated 2 months ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆152 Updated this week
- ☆99 Updated 2 months ago
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ☆109 Updated last year
- ☆404 Updated last week
- Enhancing AI Software Engineering with Repository-level Code Graph ☆160 Updated last month
- ⚖️ Awesome LLM Judges ⚖️ ☆94 Updated last week
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆90 Updated last week
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆170 Updated last week
- ☆73 Updated last week
- Sphynx Hallucination Induction ☆54 Updated 3 months ago
- ☆72 Updated 6 months ago
- LLM proxy to observe and debug what your AI agents are doing. ☆17 Updated last week
- Python SDK for running evaluations on LLM generated responses ☆278 Updated last week
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ☆264 Updated this week
- Contains the prompts we use to talk to various LLMs for different utilities inside the editor ☆76 Updated last year
- Collection of evals for Inspect AI ☆127 Updated this week
- The fastest Trust Layer for AI Agents ☆132 Updated 2 months ago
- Make your GenAI Apps Safe & Secure. Test & harden your system prompt ☆470 Updated 6 months ago
- Commit0: Library Generation from Scratch ☆144 Updated last month
- 🤖 Headless IDE for AI agents ☆186 Updated 2 weeks ago
- ☆43 Updated 9 months ago
- Let Claude control a web browser on your machine. ☆26 Updated 2 months ago
- r2e: turn any github repository into a programming agent environment ☆116 Updated 2 weeks ago
- CodeSage: Code Representation Learning At Scale (ICLR 2024) ☆106 Updated 6 months ago
- Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task. ☆169 Updated last month
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd… ☆104 Updated 3 weeks ago
- ☆123 Updated last month