philschmid / evaluate-llms
Includes examples on how to evaluate LLMs
☆20 · Updated 3 months ago
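As a quick illustration of the kind of example the repo collects, here is a minimal LLM-as-a-judge scorer. This is a hypothetical sketch, not code from evaluate-llms: the OpenAI client, model name, and rubric wording are all assumptions.

```python
# Hypothetical sketch: scoring a model answer with an LLM-as-a-judge.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment;
# the judge model and rubric are illustrative, not taken from the repo.
from openai import OpenAI

client = OpenAI()

def judge(question: str, answer: str, reference: str) -> str:
    prompt = (
        "Rate the ANSWER against the REFERENCE on a 1-5 scale.\n"
        f"QUESTION: {question}\nANSWER: {answer}\nREFERENCE: {reference}\n"
        "Reply with the score only."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any judge-capable model works here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic scoring
    )
    return resp.choices[0].message.content.strip()
```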
Alternatives and similar repositories for evaluate-llms:
Users interested in evaluate-llms are comparing it to the libraries listed below.
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric, reference answer, absolute… ☆48 · Updated 7 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆74 · Updated 5 months ago
- ☆24 · Updated last year
- Mistral + Haystack: build RAG pipelines that rock 🤘 ☆100 · Updated last year
- Low-latency, high-accuracy, custom query routers for humans and agents. Built by Prithivi Da. ☆93 · Updated 2 months ago
- Set of scripts to finetune LLMs ☆36 · Updated 10 months ago
- ☆141 · Updated 7 months ago
- ☆76 · Updated 8 months ago
- Repository containing awesome resources regarding Hugging Face tooling. ☆46 · Updated last year
- Sample notebooks and prompts for LLM evaluation ☆120 · Updated 2 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆100 · Updated 10 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆57 · Updated 11 months ago
- End-to-End LLM Guide ☆101 · Updated 7 months ago
- Resources for exploring Generative Feedback Loops with Weaviate! ☆36 · Updated last month
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and library created by… ☆29 · Updated 5 months ago
- ☆18 · Updated 4 months ago
- High-level library for batched embeddings generation, blazingly fast web-based RAG, and quantized index processing ⚡ ☆64 · Updated 3 months ago
- Experimental code for StructuredRAG: JSON Response Formatting with Large Language Models ☆101 · Updated 2 months ago
- ☆87 · Updated last year
- Running load tests on a FastAPI application using Locust ☆12 · Updated 3 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆47 · Updated 11 months ago
- PyTorch implementation of MRL ☆18 · Updated last year
- ☆45 · Updated 4 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker (a generic Elo update is sketched after this list). ☆106 · Updated last week
- ☆32 · Updated 7 months ago
- Examples of using Evidently to evaluate, test and monitor ML models. ☆20 · Updated last week
- A RAG that can scale 🧑🏻‍💻 ☆11 · Updated 8 months ago
- ☆16 · Updated last year
- Table detection with Florence. ☆13 · Updated 7 months ago
- Streamlit app for recommending eval functions using prompt diffs ☆27 · Updated last year
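For the RAGElo entry above: this page doesn't show RAGElo's actual API, but pairwise Elo ranking itself is simple. Below is a minimal sketch of the standard Elo update with common defaults (K=32, a 400-point scale); none of it is RAGElo's real code.

```python
# Hypothetical sketch of the Elo update that rankers like RAGElo build on.
# The K-factor and starting rating are common defaults, not RAGElo's values.
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """score_a is 1.0 if A wins the pairwise judgment, 0.5 for a draw, 0.0 if A loses."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))  # A's expected score
    r_a_new = r_a + k * (score_a - expected_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a_new, r_b_new

# Example: two RAG agents start at 1000; agent A wins one pairwise comparison.
print(elo_update(1000.0, 1000.0, 1.0))  # -> (1016.0, 984.0)
```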