AlmogBaku / pytest-evals
A pytest plugin for running and analyzing LLM evaluation tests.
☆121 · Updated 3 months ago
Alternatives and similar repositories for pytest-evals:
Users who are interested in pytest-evals are comparing it to the libraries listed below.
- Work with OpenAI's streaming API with ease using Python generators ☆121 · Updated 11 months ago
- ☆24 · Updated last week
- Transform your pythonic research to an artifact that engineers can deploy easily. ☆153 · Updated last month
- ☆10 · Updated 8 months ago
- HyPSTER - HyperParameter optimization on STERoids ☆48 · Updated 5 months ago
- A documentation assistant leveraging Model Context Protocol (MCP) to help programmers access the most up-to-date and relevant information… ☆19 · Updated last month
- A Lightweight Library for AI Observability ☆243 · Updated 2 months ago
- A small library of LLM judges ☆185 · Updated last week
- 🪢 Langfuse Python SDK - Instrument your LLM app with decorators or low-level SDK and get detailed tracing/observability. Works with any … ☆168 · Updated last week
- Portia Labs Python SDK for building agentic workflows. ☆132 · Updated this week
- A tiny LLM Agent with minimal dependencies, focused on local inference. ☆52 · Updated 7 months ago
- Metafeature Extraction for Unstructured Data ☆101 · Updated last month
- Tuning and Evaluation of RAG pipeline. (Automated optimization to be added soon) ☆263 · Updated last year
- MCPEngine is a client, server, and proxy implementation of model context protocol (MCP) specifically oriented towards Enterprise and real… ☆186 · Updated this week
- Pydantic extension for annotating autocorrecting fields. ☆220 · Updated 10 months ago
- LLM prompt language based on Jinja. Banks provides tools and functions to build prompts text and chat messages from generic blueprints. I… ☆92 · Updated last week
- Readymade evaluators for agent trajectories ☆183 · Updated this week
- A plugin-based gateway that orchestrates other MCPs and allows developers to build upon it enterprise-grade agents. ☆149 · Updated 2 weeks ago
- An open-source prompt engineering framework. ☆39 · Updated this week
- OpenTelemetry Instrumentation for AI Observability ☆399 · Updated this week
- Self Support ChatBot ☆16 · Updated last month
- Python library that allows you to get structured responses in the form of Pydantic models and Python types from Anthropic, Google Vertex … ☆78 · Updated 9 months ago
- Agent File (.af): An open file format for serializing stateful AI agents with persistent memory and behavior. Share, checkpoint, and vers… ☆389 · Updated last month
- ☆118 · Updated last week
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37 · Updated last year
- Python SDK for running evaluations on LLM generated responses ☆278 · Updated last week
- This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and busine… ☆150 · Updated 7 months ago
- The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applicati… ☆237 · Updated 7 months ago
- An AI extension for IPython that makes it work like Cursor ☆66 · Updated 4 months ago
- An fsspec implementation for the lakeFS project ☆49 · Updated last month