alopatenko / LLMEvaluation
A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use cases, promote the adoption of best practices in LLM assessment, and critically assess the effectiveness of these evaluation methods.
☆87 Updated this week
Alternatives and similar repositories for LLMEvaluation:
Users who are interested in LLMEvaluation are comparing it to the repositories listed below.
- Sample notebooks and prompts for LLM evaluation ☆119 Updated 2 months ago
- ☆138 Updated 6 months ago
- ☆76 Updated 7 months ago
- ☆147 Updated last month
- A set of scripts and notebooks on LLM fine-tuning and dataset creation ☆101 Updated 4 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆63 Updated 11 months ago
- ☆77 Updated 8 months ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da ☆91 Updated last month
- This project showcases an LLMOps pipeline that fine-tunes a small LLM to prepare for an outage of the service LLM. ☆294 Updated last month
- awesome synthetic (text) datasets ☆256 Updated 3 months ago
- ☆207 Updated 6 months ago
- Building a chatbot powered by a RAG pipeline to read, summarize, and quote the most relevant papers related to the user query. ☆165 Updated 9 months ago
- ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆101 Updated 9 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆74 Updated 4 months ago
- Doing simple retrieval from LLMs at various context lengths to measure accuracy ☆99 Updated 9 months ago
- ☆163 Updated 7 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆106 Updated last month
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆99 Updated last year
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆251 Updated last month
- End-to-End LLM Guide ☆99 Updated 6 months ago
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆190 Updated 3 months ago
- Starter pack for NeurIPS LLM Efficiency Challenge 2023. ☆123 Updated last year
- Late Interaction Models Training & Retrieval ☆228 Updated this week
- Let's build better datasets, together! ☆250 Updated last month
- ☆57 Updated 2 months ago
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37 Updated 9 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆249 Updated 6 months ago
- ☆80 Updated last week
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆401 Updated 11 months ago
- FastAPI wrapper around DSPy ☆230 Updated 10 months ago