huggingface / llm_training_handbook

An open collection of methodologies to help with successful training of large language models.

☆536

Alternatives and similar repositories for llm_training_handbook

Users who are interested in llm_training_handbook are comparing it to the repositories listed below.

huggingface / large_language_model_training_playbook
An open collection of implementation tips, tricks and resources for training large language models
☆482 Updated 2 years ago
epfLLM / Megatron-LLM
distributed trainer for LLMs
☆583 Updated last year
huggingface / cosmopedia
☆546 Updated 11 months ago
arielnlee / Platypus
Code for fine-tuning Platypus fam LLMs using LoRA
☆629 Updated last year
tomaarsen / attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
☆723 Updated last year
declare-lab / instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
☆548 Updated last year
mlfoundations / open_lm
A repository for research on medium sized language models.
☆515 Updated 4 months ago
pacman100 / LLM-Workshop
LLM Workshop by Sourab Mangrulkar
☆394 Updated last year
FranxYao / Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆477 Updated last year
sabetAI / BLoRA
batched loras
☆347 Updated 2 years ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆342 Updated 4 months ago
zeno-ml / zeno-build
Build, evaluate, understand, and fix LLM-based apps
☆491 Updated last year
xfactlab / orpo
Official repository for ORPO
☆464 Updated last year
lm-sys / llm-decontaminator
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
☆311 Updated last year
VikParuchuri / textbook_quality
Generate textbook-quality synthetic LLM pretraining data
☆505 Updated 2 years ago
DachengLi1 / LongChat
Official repository for LongChat and LongEval
☆531 Updated last year
allenai / OLMo-Eval
Evaluation suite for LLMs
☆364 Updated 3 months ago
llm-efficiency-challenge / neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
☆256 Updated 2 years ago
ContextualAI / gritlm
Generative Representational Instruction Tuning
☆675 Updated 4 months ago
salesforce / DialogStudio
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection and Instruction-Aware Models for Conversational AI
☆514 Updated 9 months ago
tcapelle / llm_recipes
A set of scripts and notebooks on LLM finetuning and dataset creation
☆110 Updated last year
yuchenlin / LLM-Blender
[ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive…
☆967 Updated last year
dzhulgakov / llama-mistral
Inference code for Mistral and Mixtral hacked up into original Llama implementation
☆368 Updated last year
LudwigStumpp / llm-leaderboard
A joint community effort to create one central leaderboard for LLMs.
☆305 Updated last year
salesforce / xgen
Salesforce open-source LLMs with 8k sequence length.
☆722 Updated 8 months ago
zhilizju / Awesome-instruction-tuning
A curated list of awesome instruction tuning datasets, models, papers and repositories.
☆341 Updated 2 years ago
nlpxucan / evol-instruct
☆274 Updated 2 years ago
jondurbin / bagel
A bagel, with everything.
☆324 Updated last year
JinjieNi / MixEval
The official evaluation suite and dynamic data release for MixEval.
☆250 Updated 11 months ago
datamllab / LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
☆661 Updated last year