philschmid / evaluate-llms
Includes examples on how to evaluate LLMs
☆20 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for evaluate-llms
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 4 months ago
- Repository containing awesome resources regarding Hugging Face tooling. ☆43 Updated 10 months ago
- ☆24 Updated last year
- ☆75 Updated 5 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆72 Updated 2 months ago
- Code for NeurIPS LLM Efficiency Challenge ☆54 Updated 7 months ago
- End-to-End LLM Guide ☆97 Updated 4 months ago
- PyTorch implementation for MRL ☆18 Updated 9 months ago
- ☆16 Updated last year
- An index of all of our weekly concepts + code events for aspiring AI Engineers and Business Leaders!! ☆50 Updated this week
- Low latency, High Accuracy, Custom Query routers for Co-pilots and Agents. Built by Prithivi Da ☆52 Updated this week
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆27 Updated 3 months ago
- A set of scripts and notebooks on LLM fine-tuning and dataset creation ☆93 Updated last month
- Mistral + Haystack: build RAG pipelines that rock 🤘 ☆100 Updated 9 months ago
- Chunk your text using gpt4o-mini more accurately ☆42 Updated 3 months ago
- ☆41 Updated last month
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆42 Updated 11 months ago
- Set of scripts to finetune LLMs ☆36 Updated 7 months ago
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆33 Updated 8 months ago
- Let's build better datasets, together! ☆206 Updated this week
- ☆15 Updated last month
- Experimental Code for StructuredRAG: Structured Outputs in Retrieval-Augmented Generation ☆94 Updated this week
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆40 Updated 8 months ago
- ☆46 Updated 9 months ago
- A competition to get you started on the NeurIPS AI Hackercup ☆27 Updated 2 months ago
- Examples of using Evidently to evaluate, test and monitor ML models. ☆18 Updated this week
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera… ☆29 Updated 7 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆62 Updated 2 weeks ago
- Table detection with Florence. ☆13 Updated 4 months ago
- ☆20 Updated 9 months ago