eval-protocol / python-sdk
The official Python SDK for Eval Protocol
☆65 Updated this week
Alternatives and similar repositories for python-sdk
Users who are interested in python-sdk are comparing it to the libraries listed below.
- Provider-agnostic, open-source evaluation infrastructure for language models ☆641 Updated last week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆738 Updated this week
- An interface library for RL post training with environments. ☆628 Updated this week
- A fully customizable and self-hosted sandboxing solution for AI agent code execution and computer use. It features out-of-the-box support… ☆647 Updated 5 months ago
- Benchmark and optimize LLM inference across frameworks with ease ☆129 Updated last month
- vscode extension to convert computationally intensive pytorch kernels to triton ☆22 Updated last year
- A CLI for GPUs ☆115 Updated 2 weeks ago
- ☆231 Updated 4 months ago
- An MCP Server that's also an MCP Client. Useful for letting Claude develop and test MCPs without needing to reset the application. ☆124 Updated 8 months ago
- Prompt engineering, automated. ☆346 Updated 6 months ago
- Easiest way to give context to LLMs; Attachments has the ambition to be the general funnel for any files to be transformed into images+te… ☆318 Updated last month
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆242 Updated 3 weeks ago
- Verifiers for LLM Reinforcement Learning ☆77 Updated last month
- 🤖 Headless IDE for AI agents ☆201 Updated last month
- Open collaboration infrastructure that enables communication, coordination, trust and payments for The Internet of Agents. ☆199 Updated this week
- The fastest, lightest, and easiest-to-integrate AI gateway on the market. Fully open-sourced. ☆452 Updated 3 months ago
- Async RL Training at Scale ☆734 Updated last week
- Claude Deep Research config for Claude Code. ☆223 Updated 7 months ago
- Letting Claude Code develop his own MCP tools :) ☆123 Updated 8 months ago
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆268 Updated 3 weeks ago
- 🚀 MassGen is an open-source multi-agent scaling system that runs in your terminal, autonomously orchestrating frontier models and agents… ☆585 Updated last week
- A structured framework for defining, verifying and certifying AI systems. ☆16 Updated 7 months ago
- An assistant for Slack built with Arcade and Langgraph. Interact with Google Calendar, Mail, Github, Search Engines, Firecrawl and more a… ☆111 Updated 4 months ago
- FACT – Fast Augmented Context Tools: FACT is a lean retrieval pattern that skips vector search. We cache every static token inside Claude… ☆121 Updated 3 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆307 Updated last month
- Deep Research for your internal data ☆346 Updated 5 months ago
- Workflows are an event-driven, async-first, step-based way to control the execution flow of AI applications like agents. ☆239 Updated this week
- Training-Ready RL Environments + Evals ☆164 Updated this week
- Plug-and-play tree search for agents ☆267 Updated 3 months ago
- Managed Agent Posttraining ☆52 Updated last week