microsoft / agdebugger
☆67 Updated last week
Alternatives and similar repositories for agdebugger
Users who are interested in agdebugger are comparing it to the libraries listed below.
- Official page for ICLR 2025 paper "Sufficient Context: A New Lens on Retrieval Augmented Generation Systems" ☆63 Updated 6 months ago
- ☆237 Updated 2 months ago
- A framework for fine-tuning retrieval-augmented generation (RAG) systems. ☆139 Updated last week
- Official Repo for CRMArena and CRMArena-Pro ☆132 Updated 2 months ago
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ☆174 Updated last week
- ☆80 Updated 3 months ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀 ☆103 Updated 5 months ago
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆128 Updated 3 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆260 Updated last week
- A clean, modular SDK for building AI agents with OpenHands V1. ☆459 Updated this week
- Tutorial for building an LLM router ☆242 Updated last year
- Ranking LLMs on agentic tasks ☆210 Updated 2 months ago
- MCP-based Agent Deep Evaluation System ☆142 Updated 4 months ago
- A method for steering LLMs to better follow instructions ☆76 Updated 5 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆61 Updated 8 months ago
- ☆106 Updated last year
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆144 Updated 11 months ago
- Client interface to Cleanlab Studio ☆32 Updated 11 months ago
- Benchmark and optimize LLM inference across frameworks with ease ☆158 Updated 4 months ago
- Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation ☆108 Updated last year
- ☆87 Updated last year
- DSPy in action with open-source LLMs. ☆102 Updated last year
- Simple examples using Argilla tools to build AI ☆57 Updated last year
- ☆39 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆114 Updated 9 months ago
- A Lightweight Library for AI Observability ☆255 Updated 11 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆128 Updated 11 months ago
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆141 Updated 5 months ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆82 Updated 11 months ago
- [NAACL2025] LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications ☆144 Updated 6 months ago