invariantlabs-ai / explorer

A better way of testing, inspecting, and analyzing AI Agent traces.

☆40

Alternatives and similar repositories for explorer

Users who are interested in explorer are comparing it to the repositories listed below.

invariantlabs-ai / playwright-computer-use
Let Claude control a web browser on your machine.
☆39 Updated 4 months ago
haizelabs / sphynx
Sphynx Hallucination Induction
☆53 Updated 8 months ago
zhudotexe / redel
ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)
☆88 Updated last month
haizelabs / dspy-redteam
Red-Teaming Language Models with DSPy
☆221 Updated 8 months ago
haizelabs / get-haized
A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.
☆97 Updated 6 months ago
Ziems / arbor
A framework for optimizing DSPy programs with RL
☆208 Updated this week
invariantlabs-ai / invariant
Guardrails for secure and robust agent development
☆354 Updated 2 months ago
Columbia-NLP-Lab / PAPILLON
Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles
☆58 Updated 5 months ago
zbambergerNLP / strategic-debate-tot
A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments
☆90 Updated 3 weeks ago
haizelabs / verdict
Inference-time scaling for LLMs-as-a-judge.
☆303 Updated 3 weeks ago
hwchase17 / langfuzz
☆73 Updated last year
radiantlogicinc / fastworkflow
A framework for building large-scale, deterministic, interactive workflows with a fault-tolerant, conversational UX
☆40 Updated this week
egozverev / Should-It-Be-Executed-Or-Processed
Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.
☆57 Updated 7 months ago
BBischof / yapping
Verbosity control for AI agents
☆65 Updated last year
AnswerDotAI / GeminiSave
☆52 Updated 6 months ago
microsoft / debug-gym
A Text-Based Environment for Interactive Debugging
☆272 Updated this week
brendanhogan / picoDeepResearch
☆68 Updated 5 months ago
ibm-granite / granite-guardian
The Granite Guardian models are designed to detect risks in prompts and responses.
☆119 Updated 2 weeks ago
eth-sri / ToolFuzz
ToolFuzz is a fuzzing framework designed to test your LLM Agent tools.
☆30 Updated 3 months ago
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆56 Updated 11 months ago
yoheinakajima / autofinetune
auto fine tune of models with synthetic data
☆75 Updated last year
jtanningbed / mcp-ag2-example
a simple example demonstrating MCP + ag2 (autogen) integration
☆41 Updated 3 months ago
swyxio / openlangmem
☆47 Updated last year
FanaHOVA / openai-o1-code-interpreter
Code interpreter support for o1
☆32 Updated last year
hide-org / hide
🤖 Headless IDE for AI agents
☆200 Updated 2 weeks ago
alexzhang13 / rlm
Super basic implementation (gist-like) of RLMs with REPL environments.
☆204 Updated last week
facebookresearch / ZeroSumEval
A framework for pitting LLMs against each other in an evolving library of games ⚔
☆33 Updated 6 months ago
All-Hands-AI / openhands-resolver
A system that tries to resolve all issues on a github repo with OpenHands.
☆114 Updated 11 months ago
All-Hands-AI / openhands-aci
Agent computer interface for AI software engineer.
☆111 Updated last month
Archelunch / vibe-dspy
☆47 Updated 2 months ago