huggingface/large_language_model_training_playbook

huggingface / large_language_model_training_playbook

An open collection of implementation tips, tricks and resources for training large language models

☆502

Alternatives and similar repositories for large_language_model_training_playbook

Users who are interested in large_language_model_training_playbook are comparing it to the repositories listed below.

huggingface / llm_training_handbook
An open collection of methodologies to help with successful training of large language models.
☆564 · Feb 15, 2024 · Updated 2 years ago
huggingface / olm-training
Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆98 · Feb 9, 2023 · Updated 3 years ago
LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆210 · Jan 13, 2024 · Updated 2 years ago
tunib-ai / oslo
OSLO: Open Source framework for Large-scale model Optimization
☆310 · Aug 25, 2022 · Updated 3 years ago
ko-nlp / moducorpus-sanitizer
Provides functionality for converting Modu Corpus (모두의 말뭉치) data into a form convenient for analysis.
☆11 · Mar 2, 2022 · Updated 4 years ago
huggingface / fuego
[WIP] A 🔥 interface for running code in the cloud
☆87 · May 26, 2026 · Updated 2 months ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆344 · Jun 28, 2025 · Updated last year
huggingface / olm-datasets
Pipeline for pulling and processing online language model pretraining data from the web
☆179 · Jul 31, 2023 · Updated 2 years ago
tunib-ai / parallelformers
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment
☆788 · Apr 24, 2023 · Updated 3 years ago
sgugger / torchdynamo-tests
☆20 · Nov 23, 2022 · Updated 3 years ago
jason9693 / ETA4LLMs
Calculating Expected Time for training LLM.
☆39 · Apr 17, 2023 · Updated 3 years ago
huggingface / hffs
**ARCHIVED** Filesystem interface to 🤗 Hub
☆60 · Apr 6, 2023 · Updated 3 years ago
facebookresearch / bart_ls
Long-context pretrained encoder-decoder models
☆97 · Oct 28, 2022 · Updated 3 years ago
google-research / longt5
☆183 · May 26, 2023 · Updated 3 years ago
joeljang / ELM
[ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning
☆99 · Apr 26, 2023 · Updated 3 years ago
huggingface / evaluate
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
☆2,467 · Jul 6, 2026 · Updated 2 weeks ago
lgessler / microbert
A tiny BERT for low-resource monolingual models
☆32 · Dec 24, 2025 · Updated 7 months ago
aliborji / ChatGPT_Failures
A categorical archive of ChatGPT failures
☆64 · May 25, 2023 · Updated 3 years ago
princeton-nlp / TRIME
[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674
☆193 · Jun 14, 2023 · Updated 3 years ago
allenai / RL4LMs
A modular RL library to fine-tune language models to human preferences
☆2,393 · Mar 1, 2024 · Updated 2 years ago
kakaobrain / kortok
The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020)
☆119 · Oct 8, 2020 · Updated 5 years ago
google-research / FLAN
☆1,565 · Jul 2, 2026 · Updated 3 weeks ago
tunib-ai / large-scale-lm-tutorials
Large-scale language modeling tutorials with PyTorch
☆293 · Nov 2, 2021 · Updated 4 years ago
nateraw / huggingface-datasets-converter
Scripts to convert datasets from various sources to Hugging Face Datasets.
☆57 · Oct 26, 2022 · Updated 3 years ago
adapter-hub / adapters
A Unified Library for Parameter-Efficient and Modular Transfer Learning
☆2,822 · Apr 26, 2026 · Updated 2 months ago
EleutherAI / oslo
OSLO: Open Source for Large-scale Optimization
☆175 · Sep 9, 2023 · Updated 2 years ago
nlpai-lab / Korean-CommonGen
[Findings of NAACL2022] A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
☆11 · May 27, 2022 · Updated 4 years ago
PiotrNawrot / nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
☆1,021 · Aug 21, 2024 · Updated last year
SKTBrain / KVQA
Korean Visual Question Answering
☆59 · Feb 18, 2020 · Updated 6 years ago
CarperAI / trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
☆4,753 · Jan 8, 2024 · Updated 2 years ago
huggingface / trl
Train transformer language models with reinforcement learning.
☆18,927 · Updated this week
huggingface / accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i…
☆9,794 · Updated this week
huggingface / optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization…
☆3,451 · Updated this week
huggingface / setfit
Efficient few-shot learning with Sentence Transformers
☆2,777 · May 26, 2026 · Updated 2 months ago
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,645 · May 26, 2026 · Updated 2 months ago
stefan-it / xlm-v-experiments
Experiments for XLM-V Transformers Integration
☆13 · Feb 8, 2023 · Updated 3 years ago
microsoft / fastseq
An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p…
☆433 · Aug 17, 2022 · Updated 3 years ago
naver-ai / neuralwoz
NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021)
☆36 · Jul 22, 2021 · Updated 5 years ago
JonasGeiping / cramming
Cramming the training of a (BERT-type) language model into limited compute.
☆1,367 · Jun 13, 2024 · Updated 2 years ago