braintrustdata / autoevals
AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.
☆248 · Updated this week
Related projects
Alternatives and complementary repositories for autoevals
- ☆127 · Updated this week
- Prompt engineering, automated. ☆246 · Updated 2 weeks ago
- Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API. ☆341 · Updated 6 months ago
- Low latency JSON generation using LLMs ⚡️ ☆386 · Updated 8 months ago
- A simple Python sandbox for helpful LLM data agents ☆170 · Updated 5 months ago
- Python SDK for running evaluations on LLM generated responses ☆221 · Updated this week
- This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and busine… ☆142 · Updated last month
- ⛓️ build cognitive systems, pythonic ☆326 · Updated this week
- Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments. ☆152 · Updated 5 months ago
- Text analytics for LLM apps. Cluster messages to detect use cases, outliers, power users. Detect intents and run evals with LLM (OpenAI, … ☆377 · Updated this week
- Automatically reformat any JSON into any schema with AI ☆301 · Updated last month
- ☆182 · Updated 6 months ago
- Sister project to OpenLLMetry, but in Typescript. Open-source observability for your LLM application, based on OpenTelemetry ☆266 · Updated last week
- ☆67 · Updated 3 weeks ago
- Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba… ☆219 · Updated this week
- ☆114 · Updated 5 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆221 · Updated 6 months ago
- Build hours code to share. ☆137 · Updated last week
- Python and TypeScript library for integrating the Stripe API into agentic workflows ☆107 · Updated this week
- LLM fine-tuning and eval ☆341 · Updated 8 months ago
- Work with web-enabled agents quickly — whether running a quick task or bootstrapping a full-stack product. ☆90 · Updated 3 weeks ago
- LLM-ready data connectors ☆62 · Updated 5 months ago
- Logging and caching superpowers for the openai sdk ☆100 · Updated 8 months ago
- ☆136 · Updated 11 months ago
- ☆267 · Updated 2 weeks ago
- WIP - Allows you to create DSPy pipelines using ComfyUI ☆180 · Updated 3 months ago
- Infrastructure for AI code interpreting that's powering E2B. ☆220 · Updated this week
- Action library for AI Agent ☆191 · Updated 2 weeks ago