eval-protocol / python-sdk
The official Python SDK for Eval Protocol
☆60 Updated this week
Alternatives and similar repositories for python-sdk
Users who are interested in python-sdk are comparing it to the libraries listed below.
- vscode extension to convert computationally intensive pytorch kernels to triton ☆22 Updated 11 months ago
- Provider-agnostic, open-source evaluation infrastructure for language models ☆539 Updated this week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆697 Updated this week
- Optimize prompts, code, and more with AI-powered Reflective Text Evolution ☆773 Updated this week
- ☆224 Updated 3 months ago
- A CLI for GPUs ☆112 Updated last week
- The Orchestration Layer for AI agents. Connect your models, tools, and data into a smart interface to create agentic apps that can think,… ☆225 Updated last week
- An MCP Server that's also an MCP Client. Useful for letting Claude develop and test MCPs without needing to reset the application. ☆124 Updated 6 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆299 Updated 3 weeks ago
- Async RL Training at Scale ☆650 Updated this week
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd… ☆329 Updated 2 weeks ago
- OSS RL environment + evals toolkit ☆181 Updated this week
- ⚡ Bhumi – The fastest AI inference client for Python, built with Rust for unmatched speed, efficiency, and scalability 🚀 ☆61 Updated this week
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆321 Updated this week
- Tzafon-WayPoint is a robust, scalable solution for managing large fleets of browser instances. WayPoint stands out with unmatched cold‑st… ☆71 Updated 5 months ago
- A fully customizable and self-hosted sandboxing solution for AI agent code execution and computer use. It features out-of-the-box support… ☆579 Updated 3 months ago
- A structured framework for defining, verifying and certifying AI systems. ☆14 Updated 6 months ago
- The toolkit for AI devtools context engineering. Build with codebase mapping, symbol extraction, and many kinds of code search. ☆618 Updated last week
- Letting Claude Code develop his own MCP tools :) ☆121 Updated 6 months ago
- This repo tracks the opened and merged PRs by the top SWE coding agents by OpenAI, GitHub, and others. Updates every 3 hours. ☆252 Updated this week
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆265 Updated last month
- The fastest, lightest, and easiest-to-integrate AI gateway on the market. Fully open-sourced. ☆421 Updated last month
- Training-Ready RL Environments + Evals ☆111 Updated this week
- Find the Root Cause in Your Code's Trace ☆337 Updated this week
- Easiest way to give context to LLMs; Attachments has the ambition to be the general funnel for any files to be transformed into images+te… ☆298 Updated 2 weeks ago
- A Text-Based Environment for Interactive Debugging ☆266 Updated this week
- A framework for optimizing DSPy programs with RL ☆182 Updated this week
- 🚀 MassGen: An Open-source Multi-Agent Scaling System Inspired by Grok Heavy and Gemini Deep Think. Join the discord channel: https://dis… ☆449 Updated last week
- Plug-and-play tree search for agents ☆263 Updated 2 months ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents ☆407 Updated this week