invariantlabs-ai / explorer
A better way of testing, inspecting, and analyzing AI Agent traces.
☆40 Updated this week
Alternatives and similar repositories for explorer
Users who are interested in explorer are comparing it to the libraries listed below.
- Let Claude control a web browser on your machine. ☆39 Updated 7 months ago
- Sphynx Hallucination Induction ☆53 Updated 11 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆90 Updated last month
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆100 Updated 9 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆96 Updated 3 months ago
- ☆76 Updated last year
- Guardrails for secure and robust agent development ☆378 Updated 5 months ago
- Harness used to benchmark aider against SWE Bench benchmarks ☆78 Updated last year
- Red-Teaming Language Models with DSPy ☆249 Updated 11 months ago
- Code interpreter support for o1 ☆31 Updated last year
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆61 Updated 8 months ago
- ☆55 Updated 9 months ago
- Aider's refactoring benchmark exercises based on popular python repos ☆78 Updated last year
- A mcp server that uses the Osmosis-Apply-1.7B model to apply code merges ☆53 Updated 6 months ago
- Routing on Random Forest (RoRF) ☆238 Updated last year
- ☆68 Updated 7 months ago
- SCIPE is a powerful tool for evaluating and diagnosing LLM (Large Language Model) graphs or chains. ☆25 Updated last year
- Verbosity control for AI agents ☆65 Updated last year
- A Text-Based Environment for Interactive Debugging ☆288 Updated this week
- Inference-time scaling for LLMs-as-a-judge. ☆324 Updated 2 months ago
- Small, simple agent task environments for training and evaluation ☆19 Updated last year
- A framework for optimizing DSPy programs with RL ☆304 Updated this week
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆82 Updated 11 months ago
- Contains the prompts we use to talk to various LLMs for different utilities inside the editor ☆83 Updated last year
- A framework for building large-scale, deterministic, interactive workflows with a fault-tolerant, conversational UX ☆43 Updated 3 weeks ago
- Agent computer interface for AI software engineer. ☆114 Updated last month
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆72 Updated 2 months ago
- auto fine tune of models with synthetic data ☆78 Updated last year
- ☆37 Updated 5 months ago
- Test Generation for Prompts ☆147 Updated this week