braintrustdata / autoevals

AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.

☆577

Alternatives and similar repositories for autoevals

Users who are interested in autoevals are comparing it to the libraries listed below.

567-labs / instructor-js
structured extraction for llms
☆738 Updated 6 months ago
braintrustdata / braintrust-sdk
☆78 Updated this week
inngest / agent-kit
AgentKit: Build multi-agent networks in TypeScript with deterministic routing and rich tooling via MCP.
☆552 Updated last week
hack-dance / island-ai
☆152 Updated last month
statelyai / agent
Create state-machine-powered LLM agents using XState
☆299 Updated 2 months ago
traceloop / openllmetry-js
Sister project to OpenLLMetry, but in Typescript. Open-source observability for your LLM application, based on OpenTelemetry
☆344 Updated last week
zenbase-ai / core
Prompt engineering, automated.
☆335 Updated 3 months ago
dzhng / llm-api
Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments.
☆168 Updated last year
athina-ai / athina-evals
Python SDK for running evaluations on LLM generated responses
☆291 Updated last month
tambo-ai / tambo
Add generative UI components to your AI assistant, copilot, or agent.
☆506 Updated this week
varunshenoy / super-json-mode
Low latency JSON generation using LLMs ⚡️
☆400 Updated last year
dzhng / zod-gpt
Get structured, fully typed, and validated JSON outputs from OpenAI and Anthropic models.
☆624 Updated last year
ax-llm / ax
The pretty much "official" DSPy framework for Typescript
☆1,725 Updated this week
cased / kit
The toolkit for AI devtools context engineering. Build with codebase mapping, symbol extraction, and many kinds of code search.
☆584 Updated last week
upstash / semantic-cache
A fuzzy key-value store based on semantic similarity rather than lexical equality.
☆281 Updated 8 months ago
AgentOps-AI / Jaiqu
Automatically reformat any JSON into any schema with AI
☆333 Updated 4 months ago
RhysSullivan / hogchat
Chat with your PostHog data
☆162 Updated last year
braintrustdata / braintrust-cookbook
☆36 Updated 3 weeks ago
superagent-ai / reag
Reasoning Augmented Generation
☆872 Updated 3 weeks ago
hellovai / ai-that-works
🦄 ai that works - every tuesday 10 AM PST
☆189 Updated this week
braintrustdata / braintrust-proxy
☆361 Updated this week
abshkbh / arrakis
A fully customizable and self-hosted sandboxing solution for AI agent code execution and computer use. It features out-of-the-box support…
☆556 Updated 2 months ago
promptfile / promptfile
A Markdown-like syntax for writing prompts. Includes an in-editor playground.
☆125 Updated 2 years ago
hrishioa / mandark
Simple AI coder that can do most of my work for me, including working on himself.
☆246 Updated 3 months ago
vercel / mcp-adapter
Easily spin up an MCP Server on Next.js, Nuxt, Svelte, and more
☆259 Updated last week
567-labs / kura
Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd…
☆259 Updated last month
567-labs / systematically-improving-rag
☆156 Updated this week
Helicone / ai-gateway
The fastest, lightest, and easiest-to-integrate AI gateway on the market. Fully open-sourced.
☆364 Updated this week
richardgill / llm-ui
The React library for LLMs
☆1,554 Updated last month
jerhadf / linear-mcp-server
A server that integrates Linear's project management system with the Model Context Protocol (MCP) to allow LLMs to interact with Linear.
☆318 Updated 3 months ago