AlmogBaku / pytest-evals
A pytest plugin for running and analyzing LLM evaluation tests.
☆138Updated 6 months ago
Alternatives and similar repositories for pytest-evals
Users who are interested in pytest-evals are comparing it to the libraries listed below.
- ☆74Updated 5 months ago
- Python library that allows you to get structured responses in the form of Pydantic models and Python types from Anthropic, Google Vertex …☆79Updated last year
- Pydantic extension for annotating autocorrecting fields.☆222Updated last year
- OpenTelemetry Instrumentation for AI Observability☆568Updated this week
- Python SDK for Inngest: Durable functions and workflows in Python, hosted anywhere☆122Updated this week
- Promptimize is a prompt engineering evaluation and testing toolkit.☆480Updated 3 weeks ago
- LLM prompt language based on Jinja. Banks provides tools and functions to build prompts text and chat messages from generic blueprints. I…☆114Updated last month
- Python browser sandbox.☆177Updated 4 months ago
- Claudette is Claude's friend☆265Updated 3 weeks ago
- LLM abstractions that aren't obstructions☆1,250Updated last week
- 🪢 Langfuse Python SDK - Instrument your LLM app with decorators or low-level SDK and get detailed tracing/observability. Works with any …☆247Updated this week
- Additional packages (components, document stores and the likes) to extend the capabilities of Haystack☆164Updated this week
- Convert an AI Agent into an A2A server! ✨☆105Updated last month
- Work with OpenAI's streaming API with ease using Python generators☆122Updated last year
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd…☆302Updated 2 months ago
- A Lightweight Library for AI Observability☆250Updated 6 months ago
- ☆167Updated 2 weeks ago
- ☆81Updated 9 months ago
- Bringing Generative AI to the way the Civil Service works☆132Updated this week
- ☆166Updated this week
- Flexible and lightweight library for creating prompt templates☆17Updated this week
- Named Entity Recognition using Claude Citations☆79Updated 2 months ago
- A pattern to let you try several vector databases and change as little code as possible☆38Updated 2 years ago
- A Python implementation of priompt - a neat way of managing context from diverse sources for LLM applications.☆112Updated last month
- Calculate prices for calling LLM inference APIs.☆82Updated last week
- SUQL: Conversational Search over Structured and Unstructured Data with LLMs☆279Updated last month
- Transform your pythonic research to an artifact that engineers can deploy easily.☆154Updated 2 months ago
- Leverage your LangChain trace data for fine tuning☆44Updated last year
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.☆317Updated last month
- OpenAI powered AI CLI in just a few lines of code.☆124Updated last year