microsoft / agdebugger
☆63 Updated 2 weeks ago
Alternatives and similar repositories for agdebugger
Users that are interested in agdebugger are comparing it to the libraries listed below
- ☆234 Updated 4 months ago
- Ranking LLMs on agentic tasks ☆199 Updated last week
- Official Repo for CRMArena and CRMArena-Pro ☆125 Updated 3 weeks ago
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆121 Updated last month
- Official page for ICLR 2025 paper "Sufficient Context: A New Lens on Retrieval Augmented Generation Systems" ☆60 Updated 4 months ago
- ☆102 Updated last year
- DSPY on action with OpenSource LLMs. ☆98 Updated last year
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆60 Updated 6 months ago
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ☆171 Updated last week
- A Lightweight Library for AI Observability ☆251 Updated 9 months ago
- Client interface to Cleanlab Studio ☆32 Updated 9 months ago
- A framework for fine-tuning retrieval-augmented generation (RAG) systems. ☆136 Updated this week
- MCP-based Agent Deep Evaluation System ☆138 Updated 2 months ago
- Tutorial for building LLM router ☆236 Updated last year
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆40 Updated last month
- Training setup for Langchain's Open Deep Research ☆72 Updated 3 months ago
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆138 Updated 3 months ago
- Routing on Random Forest (RoRF) ☆222 Updated last year
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆499 Updated 9 months ago
- An Automatic Prompt Optimization Framework for Large Language Models ☆137 Updated 3 months ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀 ☆102 Updated 3 months ago
- Test Generation for Prompts ☆143 Updated last week
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆123 Updated last month
- ☆79 Updated last month
- A clean, modular SDK for building AI agents with OpenHands V1. ☆199 Updated this week
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆113 Updated 7 months ago
- A Text-Based Environment for Interactive Debugging ☆277 Updated this week
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆81 Updated 9 months ago
- Automatic Prompt Optimization ☆48 Updated last year
- Dynamic Metadata based RAG Framework ☆78 Updated last year