athina-ai / ariadne
LLM Evals for Text Summarization and RAG use-cases.
☆35 Updated 11 months ago
Alternatives and similar repositories for ariadne:
Users who are interested in ariadne compare it to the libraries listed below:
- ☆195 Updated 8 months ago
- Fine-tuning and serving LLMs on any cloud ☆87 Updated last year
- Prompt engineering, automated. ☆260 Updated last month
- Python SDK for running evaluations on LLM generated responses ☆253 Updated last week
- Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba… ☆224 Updated this week
- ☆75 Updated 11 months ago
- The Identity layer for the agentic world ☆159 Updated this week
- Synthetic Data for LLM Fine-Tuning ☆107 Updated last year
- ☆79 Updated this week
- Python client library for improving your LLM app accuracy ☆96 Updated this week
- A strongly typed Python DSL for developing message passing multi agent systems ☆51 Updated 9 months ago
- ⛓️ build cognitive systems, pythonic ☆328 Updated 2 months ago
- data cleaning and curation for unstructured text ☆329 Updated 5 months ago
- LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more! ☆79 Updated 11 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆99 Updated 9 months ago
- LangEvals aggregates various language model evaluators into a single platform, providing a standard interface for a multitude of scores a… ☆43 Updated this week
- Fiddler Auditor is a tool to evaluate language models. ☆174 Updated 10 months ago
- Logging and caching superpowers for the openai sdk ☆102 Updated 10 months ago
- Red-Teaming Language Models with DSPy ☆154 Updated 9 months ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da ☆90 Updated last month
- Anthropic Claude2 Hackathon: Building MCTS with Claude for optimal action prediction during patient/doctor interactions. ☆107 Updated last year
- Using various instructor clients evaluating the quality and capabilities of extractions and reasoning. ☆48 Updated 3 months ago
- Leverage your LangChain trace data for fine tuning ☆40 Updated 5 months ago
- GPT-based Conversation Summarizer ☆147 Updated last year
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop ☆60 Updated 3 months ago
- Data-Driven Evaluation for LLM-Powered Applications ☆463 Updated last week
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆73 Updated this week
- Fluid Database ☆115 Updated 3 months ago
- ☆57 Updated last year