braintrustdata / autoevals
AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.
☆592 Updated this week
Alternatives and similar repositories for autoevals
Users who are interested in autoevals are comparing it to the libraries listed below.
- structured extraction for llms ☆747 Updated 6 months ago
- Low latency JSON generation using LLMs ⚡️ ☆399 Updated last year
- Prompt engineering, automated. ☆336 Updated 4 months ago
- ☆83 Updated this week
- Get structured, fully typed, and validated JSON outputs from OpenAI and Anthropic models. ☆626 Updated last year
- Create state-machine-powered LLM agents using XState ☆303 Updated 3 months ago
- ☆154 Updated 2 months ago
- 🦄 ai that works - every tuesday 10 AM PST ☆232 Updated this week
- ☆163 Updated last week
- AgentKit: Build multi-agent networks in TypeScript with deterministic routing and rich tooling via MCP. ☆577 Updated this week
- Python SDK for running evaluations on LLM generated responses ☆292 Updated 2 months ago
- Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments. ☆169 Updated last year
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd… ☆296 Updated last month
- Sister project to OpenLLMetry, but in Typescript. Open-source observability for your LLM application, based on OpenTelemetry ☆350 Updated this week
- The pretty much "official" DSPy framework for Typescript ☆1,780 Updated last week
- Automatically reformat any JSON into any schema with AI ☆334 Updated 5 months ago
- Reasoning Augmented Generation ☆873 Updated last month
- Add generative UI components to your AI assistant, copilot, or agent. ☆614 Updated this week
- Chat with your PostHog data ☆162 Updated last year
- A fuzzy key value store based on semantic similarity rather than lexical equality. ☆285 Updated 9 months ago
- ☆155 Updated this week
- The toolkit for AI devtools context engineering. Build with codebase mapping, symbol extraction, and many kinds of code search. ☆601 Updated last week
- ☆366 Updated this week
- ☆37 Updated 3 weeks ago
- A Markdown-like syntax for writing prompts. Includes an in-editor playground. ☆126 Updated 2 years ago
- The React library for LLMs ☆1,590 Updated last month
- ☆169 Updated last year
- Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba… ☆253 Updated last week
- Build hours code to share. ☆452 Updated this week
- Simple AI coder that can do most of my work for me, including working on himself. ☆248 Updated 4 months ago