kolenaIO / autoarena
Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation
☆102 Updated 2 months ago
Alternatives and similar repositories for autoarena
Users who are interested in autoarena are comparing it to the libraries listed below:
- ☆115 Updated 3 weeks ago
- This project enhances the construction of RAG applications by addressing challenges, improving accessibility, scalability, and managing d… ☆141 Updated 10 months ago
- Testing and evaluation framework for voice agents ☆92 Updated this week
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆85 Updated 2 weeks ago
- RAGArch is a Streamlit-based application that empowers users to experiment with various components and parameters of Retrieval-Augmented … ☆84 Updated last year
- Tutorial for building an LLM router ☆181 Updated 7 months ago
- ☆65 Updated 4 months ago
- Contextual Doc Retrieval is a Python-based system leveraging OpenAI GPT-4o and Cohere for re-ranking and query expansion, combined with B… ☆40 Updated 4 months ago
- This project uses the LlamaIndex multi-agent concierge system and the Qdrant vector database to customize the RAG application with use… ☆46 Updated 6 months ago
- ☆76 Updated 4 months ago
- Routing on Random Forest (RoRF) ☆114 Updated 4 months ago
- ☆52 Updated 4 months ago
- LangEvals aggregates various language model evaluators into a single platform, providing a standard interface for a multitude of scores a… ☆44 Updated this week
- A Ruby on Rails-style framework for the DSPy (Demonstrate, Search, Predict) project for language models like GPT, BERT, and Llama. ☆119 Updated 4 months ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆76 Updated last week
- A memory framework for Large Language Models and Agents. ☆173 Updated last month
- ☆82 Updated last month
- Repository demonstrating Chain-of-Table reasoning with multiple tables, powered by LangGraph ☆145 Updated 10 months ago
- A Lightweight Library for AI Observability ☆233 Updated this week
- 🤖 Headless IDE for AI agents ☆162 Updated last week
- RAG with PostgreSQL (Nebius) and pgvector