braintrustdata / autoevals
AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.
☆490 Updated 2 weeks ago
Alternatives and similar repositories for autoevals
Users who are interested in autoevals are comparing it to the libraries listed below.
- structured extraction for llms ☆720 Updated 4 months ago
- Sister project to OpenLLMetry, but in Typescript. Open-source observability for your LLM application, based on OpenTelemetry ☆319 Updated last month
- ☆151 Updated 4 months ago
- Prompt engineering, automated. ☆321 Updated last month
- Logging and caching superpowers for the openai sdk ☆105 Updated last year
- Python SDK for running evaluations on LLM generated responses ☆280 Updated 2 weeks ago
- ☆33 Updated last week
- ☆348 Updated this week
- ☆194 Updated last year
- Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments. ☆170 Updated last year
- ☆130 Updated this week
- Low latency JSON generation using LLMs ⚡️ ☆399 Updated last year
- ☆63 Updated this week
- AgentKit: Build multi-agent networks in TypeScript with deterministic routing and rich tooling via MCP. ☆454 Updated last week
- Automatically reformat any JSON into any schema with AI ☆330 Updated 2 months ago
- TypeScript client for OpenAI's realtime voice API. ☆348 Updated 7 months ago
- Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba… ☆242 Updated last week
- ☆405 Updated 9 months ago
- LLM-ready data connectors ☆81 Updated last year
- Tutorial for building LLM router ☆206 Updated 10 months ago
- Readymade evaluators for your LLM apps ☆538 Updated last week
- OpenTelemetry Instrumentation for AI Observability ☆442 Updated this week
- ☆355 Updated 2 weeks ago
- Create state-machine-powered LLM agents using XState ☆291 Updated last week
- The toolkit for codebase mapping, symbol extraction, and many kinds of code search. Build AI-powered devtools ☆443 Updated this week
- ☆135 Updated this week
- Reasoning Augmented Generation ☆845 Updated 3 months ago
- A fuzzy key value store based on semantic similarity rather than lexical equality. ☆276 Updated 6 months ago
- The "official" unofficial DSPy framework. Build LLM powered agents and other workflows, based on the Stanford DSP paper. ☆1,499 Updated this week
- HumanLayer enables AI agents to communicate with humans in tool-based and async workflows. Guarantee human oversight of high-stakes funct… ☆806 Updated this week