bkitano / llama-from-scratch
Llama from scratch, or How to implement a paper without crying
☆567 · Updated last year
Alternatives and similar repositories for llama-from-scratch
Users who are interested in llama-from-scratch are comparing it to the libraries listed below.
- Minimalistic large language model 3D-parallelism training ☆1,942 · Updated this week
- nanoGPT style version of Llama 3.1 ☆1,386 · Updated 10 months ago
- LLaMA 2 implemented from scratch in PyTorch ☆336 · Updated last year
- LoRA and DoRA from Scratch Implementations ☆205 · Updated last year
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆802 · Updated 2 weeks ago
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,004 · Updated 10 months ago
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆1,554 · Updated 3 weeks ago
- ☆1,229 · Updated 4 months ago
- What would you do with 1000 H100s... ☆1,056 · Updated last year
- Best practices for distilling large language models. ☆554 · Updated last year
- LLM Workshop by Sourab Mangrulkar ☆383 · Updated last year
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆713 · Updated last year
- From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :) ☆723 · Updated 7 months ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆883 · Updated last month
- distributed trainer for LLMs ☆577 · Updated last year
- An open collection of methodologies to help with successful training of large language models. ☆496 · Updated last year
- ☆520 · Updated 7 months ago
- Stanford NLP Python library for Representation Finetuning (ReFT) ☆1,490 · Updated 4 months ago
- Puzzles for exploring transformers ☆350 · Updated 2 years ago
- Building blocks for foundation models. ☆510 · Updated last year
- Official repository for ORPO ☆455 · Updated last year
- Fast bare-bones BPE for modern tokenizer training ☆158 · Updated this week
- Code repository for the paper - "Matryoshka Representation Learning" ☆507 · Updated last year
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆476 · Updated last month
- UNet diffusion model in pure CUDA ☆608 · Updated last year
- An extension of the nanoGPT repository for training small MoE models. ☆152 · Updated 3 months ago
- Best practices & guides on how to write distributed PyTorch training code ☆441 · Updated 4 months ago
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,258 · Updated 3 months ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,386 · Updated last year
- Code for fine-tuning Platypus fam LLMs using LoRA ☆629 · Updated last year