braintrustdata / autoevals
AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.
☆736Updated last week
Alternatives and similar repositories for autoevals
Users who are interested in autoevals are comparing it to the libraries listed below.
- structured extraction for llms☆759Updated 10 months ago
- Create state-machine-powered LLM agents using XState☆325Updated 6 months ago
- ☆105Updated last week
- ☆155Updated 5 months ago
- Prompt engineering, automated.☆349Updated 7 months ago
- AgentKit: Build multi-agent networks in TypeScript with deterministic routing and rich tooling via MCP.☆704Updated this week
- Get structured, fully typed, and validated JSON outputs from OpenAI and Anthropic models.☆629Updated last year
- The pretty much "official" DSPy framework for TypeScript☆2,303Updated this week
- Python SDK for running evaluations on LLM generated responses☆293Updated 6 months ago
- Low latency JSON generation using LLMs ⚡️☆398Updated last year
- Sister project to OpenLLMetry, but in TypeScript. Open-source observability for your LLM application, based on OpenTelemetry☆369Updated last week
- ☆201Updated 2 weeks ago
- Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments.☆169Updated last year
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd…☆372Updated 3 months ago
- Chat with your PostHog data☆161Updated last year
- Evaluate your LLM-powered apps with TypeScript☆1,245Updated last week
- ☆377Updated last week
- The TypeScript LLM Evaluation Library☆150Updated 3 weeks ago
- A fuzzy key-value store based on semantic similarity rather than lexical equality.☆288Updated last year
- Reasoning Augmented Generation☆889Updated 4 months ago
- Inference-time scaling for LLMs-as-a-judge.☆316Updated last month
- Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba…☆264Updated last week
- Provider-agnostic, open-source evaluation infrastructure for language models☆681Updated this week
- Automatically reformat any JSON into any schema with AI☆339Updated 8 months ago
- A tool kit for generating high quality prompts using DSPy GEPA optimizer☆286Updated last month
- Simple AI coder that can do most of my work for me, including working on himself.☆252Updated 8 months ago
- ☆46Updated this week
- ☆173Updated last week
- smol-podcaster is your podcast production agent 🎙️☆373Updated 3 weeks ago
- ☆169Updated last year