eval-protocol / python-sdk
The official Python SDK for Eval Protocol
☆87 Updated last week
Alternatives and similar repositories for python-sdk
Users that are interested in python-sdk are comparing it to the libraries listed below
- Provider-agnostic, open-source evaluation infrastructure for language models ☆690 Updated last week
- ☆616 Updated this week
- A fully customizable and self-hosted sandboxing solution for AI agent code execution and computer use. It features out-of-the-box support… ☆721 Updated 6 months ago
- Serverless Posttraining ☆63 Updated last week
- Open source codebase for Scale Agentex ☆238 Updated this week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆768 Updated this week
- Inference-time scaling for LLMs-as-a-judge. ☆316 Updated last month
- An MCP Server that's also an MCP Client. Useful for letting Claude develop and test MCPs without needing to reset the application. ☆123 Updated 9 months ago
- A framework for optimizing DSPy programs with RL ☆298 Updated last month
- Easiest way to give context to LLMs; Attachments has the ambition to be the general funnel for any files to be transformed into images+te… ☆324 Updated 3 months ago
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆391 Updated this week
- Routing on Random Forest (RoRF) ☆233 Updated last year
- The CLI for GPUs ☆126 Updated 2 weeks ago
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆270 Updated 2 months ago
- Letting Claude Code develop his own MCP tools :) ☆122 Updated 9 months ago
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆280 Updated 2 months ago
- The fastest, lightest, and easiest-to-integrate AI gateway on the market. Fully open-sourced. ☆481 Updated 3 weeks ago
- Claude Deep Research config for Claude Code. ☆222 Updated 9 months ago
- An in-depth book and reference on building agentic systems like Claude Code ☆254 Updated 6 months ago
- ☆220 Updated last week
- ACP is the Agent Control Plane - a distributed agent scheduler optimized for simplicity, clarity, and control. It is designed for outer-l… ☆273 Updated 5 months ago
- ☆234 Updated 5 months ago
- Training-Ready RL Environments + Evals ☆190 Updated last week
- MCP Community Working Group repository ☆65 Updated 3 weeks ago
- Together Open Deep Research ☆355 Updated 8 months ago
- A Node.js package and GitHub Action for evaluating MCP (Model Context Protocol) tool implementations using LLM-based scoring. This helps … ☆122 Updated 5 months ago
- 🤖 Headless IDE for AI agents ☆200 Updated 2 months ago
- Observability and runtime visualization for JS/TS/Python code with zero code change ☆133 Updated 6 months ago
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd… ☆377 Updated 3 months ago
- This repo tracks the opened and merged PRs by the top SWE coding agents by OpenAI, GitHub, and others. Updates every 3 hours. ☆296 Updated this week