Best practices for distilling large language models.
☆609 · Updated Feb 1, 2024
Alternatives and similar repositories for llm_distillation_playbook
Users interested in llm_distillation_playbook are comparing it to the repositories listed below.
- This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicit… · ☆1,262 · Updated Mar 9, 2025
- An Open Source Toolkit For LLM Distillation · ☆880 · Updated Dec 21, 2025
- A pipeline for LLM knowledge distillation · ☆112 · Updated Apr 2, 2025
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs · ☆3,732 · Updated May 21, 2025
- Easy to use, High Performant Knowledge Distillation for LLMs · ☆96 · Updated May 5, 2025
- Robust recipes to align language models with human and AI preferences · ☆5,510 · Updated Sep 8, 2025
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) · ☆252 · Updated Mar 13, 2025
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… · ☆3,114 · Updated Mar 2, 2026
- Tools for merging pretrained large language models. · ☆6,842 · Updated Feb 28, 2026
- ☆584 · Updated Sep 7, 2023
- Go ahead and axolotl questions · ☆11,395 · Updated this week
- General technology for enabling AI capabilities w/ LLMs and MLLMs · ☆4,292 · Updated Dec 22, 2025
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. · ☆13,206 · Updated Mar 1, 2026
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters · ☆1,900 · Updated Jan 21, 2024
- Serving multiple LoRA finetuned LLM as one · ☆1,144 · Updated May 8, 2024
- LLM Finetuning with peft · ☆2,822 · Updated Aug 1, 2025
- Minimalistic large language model 3D-parallelism training · ☆2,588 · Updated Feb 19, 2026
- Implementing BERT + CRF with PyTorch for Chinese NER. · ☆10 · Updated Mar 7, 2022
- Codes, scripts, and notebooks on various aspects of transformer models. · ☆27 · Updated Feb 27, 2023
- PyTorch native post-training library · ☆5,697 · Updated this week
- Structured Outputs · ☆13,488 · Updated Mar 2, 2026
- DSPy: The framework for programming—not prompting—language models · ☆32,519 · Updated this week
- A framework for few-shot evaluation of language models. · ☆11,618 · Updated this week
- Data and tools for generating and inspecting OLMo pre-training data. · ☆1,434 · Updated Nov 5, 2025
- Image Search Engine with HuggingFace Sentence Transformer · ☆12 · Updated Aug 31, 2023
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… · ☆3,868 · Updated May 17, 2025
- Automatically evaluate your LLMs in Google Colab · ☆687 · Updated May 7, 2024
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" · ☆26 · Updated Feb 9, 2026
- This repository contains the implementation of evaluation metrics for recommendation systems. We have compared similarity, candidate gene… · ☆27 · Updated Feb 21, 2025
- This repository contains demos I made with the Transformers library by HuggingFace. · ☆11,511 · Updated this week
- ☆3,082 · Updated Nov 21, 2025
- A guidance language for controlling large language models. · ☆21,333 · Updated Feb 13, 2026
- Efficient Triton Kernels for LLM Training · ☆6,189 · Updated this week
- AllenAI's post-training codebase · ☆3,614 · Updated this week
- Stanford NLP Python library for Representation Finetuning (ReFT) · ☆1,560 · Updated Jan 14, 2026
- Machine Learning Engineering Open Book · ☆17,286 · Updated Feb 21, 2026
- Curated list of datasets and tools for post-training. · ☆4,274 · Updated Nov 10, 2025
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. · ☆2,915 · Updated Mar 3, 2026
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads · ☆2,714 · Updated Jun 25, 2024
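Several of the distillation toolkits above build on the classic knowledge-distillation objective: train the student to match the teacher's temperature-softened output distribution. A minimal, framework-free sketch of that loss (function names are illustrative and not taken from any repository listed here):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T yields a softer distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    # Forward KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in the standard soft-label distillation formulation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In practice this term is usually combined with the ordinary cross-entropy on ground-truth labels, and production libraries compute it on batched tensors (e.g. with a KL-divergence loss over log-probabilities) rather than per-example Python lists.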