braintrustdata / autoevals
AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.
☆645 Updated 3 weeks ago
Alternatives and similar repositories for autoevals
Users who are interested in autoevals are comparing it to the libraries listed below:
- structured extraction for llms ☆755 Updated 8 months ago
- ☆154 Updated 3 months ago
- ☆92 Updated this week
- Create state-machine-powered LLM agents using XState ☆317 Updated 4 months ago
- AgentKit: Build multi-agent networks in TypeScript with deterministic routing and rich tooling via MCP. ☆610 Updated last week
- Low latency JSON generation using LLMs ⚡️ ☆399 Updated last year
- The toolkit for AI devtools context engineering. Build with codebase mapping, symbol extraction, and many kinds of code search. ☆632 Updated last week
- Prompt engineering, automated. ☆342 Updated 5 months ago
- Optimize prompts, code, and more with AI-powered Reflective Text Evolution ☆883 Updated last week
- The pretty much "official" DSPy framework for Typescript ☆2,035 Updated this week
- Sister project to OpenLLMetry, but in Typescript. Open-source observability for your LLM application, based on OpenTelemetry ☆358 Updated last week
- A fully customizable and self-hosted sandboxing solution for AI agent code execution and computer use. It features out-of-the-box support… ☆624 Updated 4 months ago
- Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments. ☆169 Updated last year
- Get structured, fully typed, and validated JSON outputs from OpenAI and Anthropic models. ☆626 Updated last year
- Compose data structures, serialize them to prompts. ☆66 Updated 3 months ago
- Evaluate your LLM-powered apps with TypeScript ☆902 Updated last month
- Add generative UI components to your AI assistant, copilot, or agent. ☆764 Updated this week
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd… ☆330 Updated 3 weeks ago
- Reasoning Augmented Generation ☆879 Updated 2 months ago
- ☆182 Updated this week
- A fuzzy key value store based on semantic similarity rather than lexical equality. ☆285 Updated 10 months ago
- Provider-agnostic, open-source evaluation infrastructure for language models ☆558 Updated this week
- Automatically reformat any JSON into any schema with AI ☆335 Updated 6 months ago
- Chat with your PostHog data ☆160 Updated last year
- The React library for LLMs ☆1,634 Updated 3 months ago
- A Markdown-like syntax for writing prompts. Includes an in-editor playground. ☆127 Updated 2 years ago
- ☆168 Updated last week
- Python SDK for running evaluations on LLM generated responses ☆292 Updated 4 months ago
- A tool kit for generating high quality prompts using DSPy GEPA optimizer ☆198 Updated 2 weeks ago
- The fastest, lightest, and easiest-to-integrate AI gateway on the market. Fully open-sourced. ☆433 Updated 2 months ago