aiverify-foundation / LLM-Evals-Catalogue
This repository stems from our paper, “Cataloguing LLM Evaluations”, and serves as a living, collaborative catalogue of LLM evaluation frameworks, benchmarks and papers.
☆18 Updated 2 years ago
Alternatives and similar repositories for LLM-Evals-Catalogue
Users who are interested in LLM-Evals-Catalogue are comparing it to the repositories listed below.
- Sample notebooks and prompts for LLM evaluation ☆156 Updated 3 weeks ago
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆154 Updated this week
- An index of all of our weekly concepts + code events for aspiring AI Engineers and Business Leaders!! ☆91 Updated last week
- ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆114 Updated last year
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆115 Updated 4 months ago
- ☆74 Updated last year
- ☆20 Updated last year
- ☆146 Updated last year
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆138 Updated 3 months ago
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆92 Updated last week
- Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph ☆147 Updated last year
- A framework for fine-tuning retrieval-augmented generation (RAG) systems. ☆136 Updated this week
- LangFair is a Python library for conducting use-case level LLM bias and fairness assessments ☆242 Updated last week
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆123 Updated 3 weeks ago
- ☆30 Updated last year
- ☆89 Updated 6 months ago
- A Lightweight Library for AI Observability ☆251 Updated 9 months ago
- What, Why and How of LLMs. ☆75 Updated 2 months ago
- ☆124 Updated 9 months ago
- ☆163 Updated 9 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆122 Updated 9 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc. on task… ☆179 Updated last year
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop ☆76 Updated 7 months ago
- ☆38 Updated last year
- Tuning and Evaluation of RAG pipeline. (Automated optimization to be added soon) ☆263 Updated last year
- A curated list of awesome synthetic data tools (open source and commercial). ☆222 Updated last year
- Official Implementation of "Affordable AI Assistants with Knowledge Graph of Thoughts" ☆195 Updated last month
- ☆89 Updated last year
- Building a chatbot powered by a RAG pipeline to read, summarize and quote the most relevant papers related to the user query. ☆167 Updated last year
- Automated knowledge graph creation SDK ☆122 Updated last year