eval-protocol / python-sdk
The official Python SDK for Eval Protocol
☆87 Updated this week
Alternatives and similar repositories for python-sdk
Users that are interested in python-sdk are comparing it to the libraries listed below
- Provider-agnostic, open-source evaluation infrastructure for language models ☆661 Updated last week
- An interface library for RL post training with environments. ☆753 Updated this week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆756 Updated this week
- A fully customizable and self-hosted sandboxing solution for AI agent code execution and computer use. It features out-of-the-box support… ☆663 Updated 5 months ago
- 🚀 MassGen is an open-source multi-agent scaling system that runs in your terminal, autonomously orchestrating frontier models and agents… ☆613 Updated last week
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆269 Updated last month
- ☆233 Updated 5 months ago
- OSS RL environment + evals toolkit ☆202 Updated this week
- Benchmark and optimize LLM inference across frameworks with ease ☆138 Updated 2 months ago
- The fastest, lightest, and easiest-to-integrate AI gateway on the market. Fully open-sourced. ☆468 Updated last week
- Optimize prompts, code, and more with AI-powered Reflective Text Evolution ☆1,641 Updated last week
- An alignment auditing agent capable of quickly exploring alignment hypothesis ☆690 Updated this week
- Context Engineering Course with DSPy ☆202 Updated 4 months ago
- Routing on Random Forest (RoRF) ☆222 Updated last year
- Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol ☆369 Updated 3 months ago
- Training-Ready RL Environments + Evals ☆177 Updated last week
- Async RL Training at Scale ☆780 Updated last week
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆255 Updated last month
- A Node.js package and GitHub Action for evaluating MCP (Model Context Protocol) tool implementations using LLM-based scoring. This helps … ☆120 Updated 5 months ago
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆376 Updated this week
- Observability and runtime visualization for JS/TS/Python code with zero code change ☆134 Updated 5 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆312 Updated 3 weeks ago
- 🧬 The Huxley-Gödel Machine ☆301 Updated this week
- A cache for AI agents to learn and replay complex behaviors. ☆756 Updated 5 months ago
- Open collaboration infrastructure that enables communication, coordination, trust and payments for The Internet of Agents. ☆201 Updated this week
- An MCP Server that's also an MCP Client. Useful for letting Claude develop and test MCPs without needing to reset the application. ☆123 Updated 8 months ago
- Managed Agent Posttraining ☆60 Updated last week
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆492 Updated last week
- Helping you select an AI agent framework ☆406 Updated last week
- A framework for optimizing DSPy programs with RL ☆285 Updated last week