AlmogBaku / pytest-evals
A pytest plugin for running and analyzing LLM evaluation tests.
☆127 · Updated 4 months ago
Alternatives and similar repositories for pytest-evals
Users who are interested in pytest-evals are comparing it to the libraries listed below.
- Work with OpenAI's streaming API at ease with Python generators ☆121 · Updated last year
- Transform your pythonic research to an artifact that engineers can deploy easily. ☆154 · Updated last week
- A WhatsApp bot that can participate in group conversations, powered by AI. The bot monitors group messages and responds when mentioned. ☆78 · Updated last week
- Record your service operations in production and replay them locally at any time in a sandbox ☆106 · Updated 5 months ago
- ☆10 · Updated 10 months ago
- Python library that allows you to get structured responses in the form of Pydantic models and Python types from Anthropic, Google Vertex … ☆78 · Updated 11 months ago
- A documentation assistant leveraging Model Context Protocol (MCP) to help programmers access the most up-to-date and relevant information… ☆19 · Updated 3 months ago
- HyPSTER - HyperParameter optimization on STERoids ☆48 · Updated 7 months ago
- MCP server for Israel Government Data ☆60 · Updated this week
- A tiny LLM Agent with minimal dependencies, focused on local inference. ☆53 · Updated 8 months ago
- Curated list of tools and frameworks assisting in monitoring data quality ☆12 · Updated 3 years ago
- Metafeature Extraction for Unstructured Data ☆102 · Updated 3 months ago
- A powerful AI observability framework that provides comprehensive insights into agent interactions across platforms, enabling developers … ☆86 · Updated last month
- Self Support ChatBot ☆16 · Updated 3 months ago
- 🪢 Langfuse Python SDK - Instrument your LLM app with decorators or low-level SDK and get detailed tracing/observability. Works with any … ☆203 · Updated last week
- A plugin-based gateway that orchestrates other MCPs and allows developers to build upon it enterprise-grade agents. ☆207 · Updated 2 months ago
- A Lightweight Library for AI Observability ☆246 · Updated 4 months ago
- Constrain LLM output ☆112 · Updated 11 months ago
- A small library of LLM judges ☆216 · Updated last week
- Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications ☆98 · Updated 8 months ago
- ☆72 · Updated 7 months ago
- OpenTelemetry Instrumentation for AI Observability ☆480 · Updated this week
- ☆37 · Updated 2 weeks ago
- The Logfire MCP Server is here! ☆78 · Updated last month
- A python implementation of priompt - a neat way of managing context from diverse sources for LLM applications. ☆111 · Updated 10 months ago
- Transform any OpenAPI/Swagger definition into a fully-featured Model Context Protocol (MCP) server ☆152 · Updated 2 weeks ago
- Testing and evaluation framework for voice agents ☆124 · Updated 3 weeks ago
- A lightweight tool that lets you simply build prompts and get Pydantic objects as outputs ☆19 · Updated 3 weeks ago
- Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation ☆104 · Updated 6 months ago
- Open-source AI copilot that lets you chat with your observability data and code 🧙‍♂️ ☆351 · Updated 2 months ago