relari-ai / continuous-eval

Data-Driven Evaluation for LLM-Powered Applications

☆501

Alternatives and similar repositories for continuous-eval

Users who are interested in continuous-eval compare it to the libraries listed below.

phospho-app / text-analytics-legacy
Legacy project
☆438 Updated 2 weeks ago
athina-ai / athina-evals
Python SDK for running evaluations on LLM generated responses
☆289 Updated last month
zenbase-ai / core
Prompt engineering, automated.
☆335 Updated 3 months ago
NeumTry / NeumAI
Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.
☆859 Updated last year
SciPhi-AI / agent-search
AgentSearch is a framework for powering search agents and enabling customizable local search.
☆496 Updated last year
openfoundry-ai / model_manager
Model Manager is a Python package that simplifies the process of deploying an open source AI model to your own cloud.
☆325 Updated last year
Trainy-ai / llm-atc
Fine-tuning and serving LLMs on any cloud
☆90 Updated last year
arthur-ai / bench
A tool for evaluating LLMs
☆423 Updated last year
dgarnitz / vectorflow
VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of y…
☆697 Updated last year
redotvideo / haven
LLM fine-tuning and eval
☆344 Updated last year
AI-Northstar-Tech / vector-io
Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba…
☆252 Updated this week
ask-fini / paramount
Agent accuracy measurements for LLMs
☆205 Updated last year
sheet0 / npi
Action library for AI Agent
☆222 Updated 4 months ago
bananaml / fructose
☆745 Updated last year
automorphic-ai / aegis
Self-hardening firewall for large language models
☆265 Updated last year
automorphic-ai / trex
Enforce structured output from LLMs 100% of the time
☆249 Updated last year
aryn-ai / sycamore
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
☆547 Updated this week
Tanuki / tanuki.py
Prompt engineering for developers
☆687 Updated last year
simonmesmith / agentflow
Complex LLM Workflows from Simple JSON.
☆308 Updated last year
reworkd / bananalyzer
Open source AI Agent evaluation framework for web tasks 🐒🍌
☆304 Updated 7 months ago
TonicAI / tonic_validate
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
☆314 Updated 3 weeks ago
pchunduri6 / rag-demystified
An LLM-powered advanced RAG pipeline built from scratch
☆845 Updated last year
jxnl / n-levels-of-rag
☆195 Updated last year
taylorai / galactic
data cleaning and curation for unstructured text
☆328 Updated 11 months ago
apache / burr
Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastr…
☆1,750 Updated last week
superagent-ai / super-rag
Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API.
☆380 Updated last year
cohere-ai / cohere-terrarium
A simple Python sandbox for helpful LLM data agents
☆276 Updated last year
tigerlab-ai / tiger
Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)
☆398 Updated last year
pinecone-io / canopy
Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
☆1,019 Updated 8 months ago
tensorlakeai / indexify
A realtime serving engine for Data-Intensive Generative AI Applications
☆1,041 Updated last week