tcapelle / mistral_wandb
A full-fledged mistral+wandb
☆13 Updated last year
Alternatives and similar repositories for mistral_wandb
Users who are interested in mistral_wandb are comparing it to the libraries listed below.
- A small library of LLM judges ☆319 Updated 6 months ago
- ☆147 Updated last year
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆174 Updated last week
- ☆80 Updated last year
- Includes examples on how to evaluate LLMs ☆23 Updated last year
- ☆43 Updated last year
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆68 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆51 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆80 Updated last year
- WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting. ☆61 Updated last month
- ☆31 Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆126 Updated 3 months ago
- Notebooks for training universal 0-shot classifiers on many different tasks ☆139 Updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆107 Updated 4 months ago
- ☆54 Updated last year
- awesome synthetic (text) datasets ☆321 Updated 3 weeks ago
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆317 Updated last year
- Sample notebooks and prompts for LLM evaluation ☆159 Updated 3 months ago
- ☆48 Updated 2 months ago
- ☆50 Updated last year
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀 ☆103 Updated 5 months ago
- The repository contains generative AI analytics platform application code. ☆28 Updated 4 months ago
- ☆30 Updated 2 years ago
- ☆129 Updated last year
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆113 Updated last year
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ☆174 Updated last week
- ☆23 Updated 2 years ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Updated last year
- An attribution library for LLMs ☆46 Updated last year
- Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models ☆26 Updated 8 months ago