invariantlabs-ai / explorer
A better way of testing, inspecting, and analyzing AI Agent traces.
☆40, updated last month
Alternatives and similar repositories for explorer
Users who are interested in explorer are comparing it to the libraries listed below.
- Sphynx Hallucination Induction (☆53, updated 6 months ago)
- Let Claude control a web browser on your machine. (☆36, updated 2 months ago)
- Inference-time scaling for LLMs-as-a-judge. (☆276, updated last month)
- Guardrails for secure and robust agent development (☆334, updated 3 weeks ago)
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments (☆88, updated 10 months ago)
- Letting Claude Code develop his own MCP tools :) (☆121, updated 5 months ago)
- A framework for optimizing DSPy programs with RL (☆150, updated last week)
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) (☆84, updated 5 months ago)
- Agent computer interface for AI software engineer. (☆101, updated this week)
- A Text-Based Environment for Interactive Debugging (☆254, updated last week)
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles (☆54, updated 3 months ago)
- ☆71, updated 10 months ago
- Red-Teaming Language Models with DSPy (☆208, updated 6 months ago)
- ☆25, updated 2 weeks ago
- ☆52, updated 4 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. (☆95, updated 4 months ago)
- 🤖 Headless IDE for AI agents (☆200, updated 4 months ago)
- A framework for building large-scale, deterministic, interactive workflows with a fault-tolerant, conversational UX (☆23, updated this week)
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… (☆68, updated last year)
- Routing on Random Forest (RoRF) (☆191, updated 11 months ago)
- ToolFuzz is a fuzzing framework designed to test your LLM Agent tools. (☆25, updated last month)
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. (☆54, updated 5 months ago)
- A system that tries to resolve all issues on a github repo with OpenHands. (☆112, updated 9 months ago)
- Prompt design in Python (☆62, updated 8 months ago)
- ☆97, updated 11 months ago
- Verbosity control for AI agents (☆65, updated last year)
- Small, simple agent task environments for training and evaluation (☆18, updated 9 months ago)
- SCIPE is a powerful tool for evaluating and diagnosing LLM (Large Language Model) graphs or chains. (☆25, updated 9 months ago)
- Code interpreter support for o1 (☆32, updated 11 months ago)
- Thoughtful Lightning AI Assistant - Dual-engine system with DeepSeek reasoning and Groq inference, featuring Gradio UI, secure API manage… (☆20, updated 7 months ago)