Data-Driven Evaluation for LLM-Powered Applications
☆515 · Jan 22, 2025 · Updated last year
Alternatives and similar repositories for continuous-eval
Users interested in continuous-eval compare it to the libraries listed below.
- PostgreSQL vector database extension for building AI applications ☆874 · Dec 12, 2024 · Updated last year
- AgentSearch is a framework for powering search agents and enabling customizable local search. ☆518 · Apr 22, 2024 · Updated last year
- Python SDK for running evaluations on LLM generated responses ☆298 · Jun 6, 2025 · Updated 8 months ago
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications. ☆324 · Jul 10, 2025 · Updated 7 months ago
- ☆11 · Aug 26, 2024 · Updated last year
- A structured framework for defining, verifying and certifying AI systems. ☆17 · Mar 11, 2025 · Updated 11 months ago
- Supercharge Your LLM Application Evaluations 🚀 ☆12,736 · Feb 24, 2026 · Updated last week
- All-in-one platform for search, recommendations, RAG, and analytics offered via API ☆2,604 · Jan 25, 2026 · Updated last month
- Zep | Examples, Integrations, & More ☆4,121 · Updated this week
- Extract valuable information from your project github Stars & Forks such as email, company, twitter and then explore it with streamlit🌟 ☆21 · Feb 8, 2024 · Updated 2 years ago
- SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. ☆7,711 · Nov 7, 2025 · Updated 3 months ago
- Open-source BI for engineers ☆2,360 · Feb 5, 2026 · Updated 3 weeks ago
- The LLM Evaluation Framework ☆13,904 · Updated this week
- Legacy project of an analytics platform for LLM-generated content ☆439 · Jul 17, 2025 · Updated 7 months ago
- Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chro… ☆3,019 · Feb 11, 2026 · Updated 3 weeks ago
- UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured ch… ☆2,338 · Aug 18, 2024 · Updated last year
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,859 · May 17, 2025 · Updated 9 months ago
- The modern replacement for Jupyter Notebooks ☆2,185 · Dec 1, 2024 · Updated last year
- Model Manager is a Python package that simplifies the process of deploying an open source AI model to your own cloud. ☆338 · May 20, 2024 · Updated last year
- Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude,… ☆10,691 · Updated this week
- Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastr… ☆1,935 · Updated this week
- Run evals using LLM ☆27 · Jan 8, 2026 · Updated last month
- Evaluation and Tracking for LLM Experiments and AI Agents ☆3,124 · Updated this week
- DSPy: The framework for programming—not prompting—language models ☆32,519 · Updated this week
- 🪓 Run Background Tasks at Scale ☆6,664 · Updated this week
- Neural Search ☆367 · Mar 11, 2025 · Updated 11 months ago
- AI Observability & Evaluation ☆8,666 · Feb 26, 2026 · Updated last week
- structured outputs for llms ☆12,468 · Feb 25, 2026 · Updated last week
- Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Manageme… ☆2,251 · Updated this week
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆1,097 · Feb 2, 2025 · Updated last year
- 🐢 Open-Source Evaluation & Testing library for LLM Agents ☆5,141 · Updated this week
- Dockerized LLM inference server with constrained output (JSON mode), built on top of vLLM and outlines. Faster, cheaper and without rate … ☆27 · Feb 17, 2024 · Updated 2 years ago
- An open-source visual programming environment for battle-testing prompts to LLMs. ☆2,950 · Jan 2, 2026 · Updated 2 months ago
- Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API. ☆388 · Apr 30, 2024 · Updated last year
- Structured Outputs ☆13,488 · Updated this week
- Harness LLMs with Multi-Agent Programming ☆3,921 · Updated this week
- Easy token price estimates for 400+ LLMs. TokenOps. ☆1,944 · Sep 5, 2025 · Updated 5 months ago
- Adding guardrails to large language models. ☆6,459 · Feb 24, 2026 · Updated last week
- ☆749 · Apr 17, 2024 · Updated last year