invariantlabs-ai / invariant-gateway
LLM proxy to observe and debug what your AI agents are doing.
☆20 Updated this week
Alternatives and similar repositories for invariant-gateway
Users who are interested in invariant-gateway are comparing it to the repositories listed below.
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆35 Updated this week
- Guardrails for secure and robust agent development ☆252 Updated this week
- ☆93 Updated 8 months ago
- Red-Teaming Language Models with DSPy ☆192 Updated 3 months ago
- ☆72 Updated 6 months ago
- ☆65 Updated 2 months ago
- ☆57 Updated last month
- ToolFuzz is a fuzzing framework designed to test your LLM Agent tools. ☆17 Updated 2 months ago
- A plugin-based gateway that orchestrates other MCPs and allows developers to build upon it enterprise-grade agents. ☆157 Updated 3 weeks ago
- ☆77 Updated 6 months ago
- CodeSage: Code Representation Learning At Scale (ICLR 2024) ☆107 Updated 6 months ago
- Code for ScribeAgent paper ☆57 Updated 2 months ago
- EvoEval: Evolving Coding Benchmarks via LLM ☆70 Updated last year
- Simple examples using Argilla tools to build AI ☆52 Updated 5 months ago
- Comprehensive metrics, insights, and visualization for Phidata and Crew AI applications ☆25 Updated last month
- r2e: turn any github repository into a programming agent environment ☆119 Updated 3 weeks ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆69 Updated 6 months ago
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆81 Updated last month
- Let Claude control a web browser on your machine. ☆28 Updated 2 months ago
- ☆150 Updated 2 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆97 Updated 2 weeks ago
- ☆100 Updated 2 months ago
- ☆100 Updated this week
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆78 Updated 2 months ago
- Ranking LLMs on agentic tasks ☆132 Updated 3 weeks ago
- A Text-Based Environment for Interactive Debugging ☆199 Updated this week
- DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs. ☆172 Updated this week
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec… ☆30 Updated 9 months ago
- CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments ☆55 Updated 2 months ago
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera… ☆32 Updated last year