huggingface / evaluation-guidebook
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
☆1,882 · Updated last month
Alternatives and similar repositories for evaluation-guidebook
Users who are interested in evaluation-guidebook are comparing it to the libraries listed below.
- A reading list on LLM based Synthetic Data Generation 🔥 ☆1,460 · Updated 5 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,108 · Updated this week
- Textbook on reinforcement learning from human feedback ☆1,295 · Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,927 · Updated this week
- Tool for generating high quality Synthetic datasets ☆1,379 · Updated 2 weeks ago
- Synthetic data curation for post-training and structured data extraction ☆1,547 · Updated 3 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,715 · Updated last week
- Curated list of datasets and tools for post-training. ☆3,850 · Updated 3 months ago
- ☆687 · Updated 6 months ago
- A collection of notebooks/recipes showcasing usecases of open-source models with Together AI. ☆1,071 · Updated last week
- ☆2,065 · Updated last week
- Best practices for distilling large language models. ☆584 · Updated last year
- Benchmark Large Language Models Reliably On Your Data ☆411 · Updated last month
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,568 · Updated 5 months ago
- Evaluate your LLM's response with Prometheus and GPT4 💯 ☆1,009 · Updated 6 months ago
- Bringing BERT into modernity via both architecture changes and scaling ☆1,559 · Updated 4 months ago
- Minimalistic large language model 3D-parallelism training ☆2,311 · Updated 2 months ago
- Fast State-of-the-Art Static Embeddings ☆1,898 · Updated last month
- Recipes for shrinking, optimizing, customizing cutting edge vision models. ☆1,776 · Updated 2 weeks ago
- Fast Semantic Text Deduplication & Filtering ☆830 · Updated 2 weeks ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆822 · Updated 3 months ago
- Automatically evaluate your LLMs in Google Colab ☆664 · Updated last year
- A 4-hour coding workshop to understand how LLMs are implemented and used ☆1,042 · Updated 10 months ago
- Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs… ☆480 · Updated 11 months ago
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. ☆1,077 · Updated 9 months ago
- Environments for LLM Reinforcement Learning ☆3,475 · Updated this week
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆2,068 · Updated last year
- Stanford NLP Python library for Representation Finetuning (ReFT) ☆1,526 · Updated 9 months ago
- ☆1,309 · Updated 8 months ago
- System 2 Reasoning Link Collection ☆855 · Updated 7 months ago