LambdaLabsML / distributed-training-guide
Best practices & guides on how to write distributed PyTorch training code
☆279 Updated last week
Related projects
Alternatives and complementary repositories for distributed-training-guide
- Website for hosting the Open Foundation Models Cheat Sheet. ☆255 Updated 4 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆203 Updated 2 weeks ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆479 Updated 2 weeks ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆252 Updated 5 months ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆711 Updated last month
- Fast bare-bones BPE for modern tokenizer training ☆142 Updated 3 weeks ago
- System 2 Reasoning Link Collection ☆686 Updated 2 weeks ago
- For optimization algorithm research and development. ☆417 Updated this week
- LoRA and DoRA from Scratch Implementations ☆188 Updated 8 months ago
- Annotated version of the Mamba paper ☆455 Updated 8 months ago
- UNet diffusion model in pure CUDA ☆567 Updated 4 months ago
- ☆292 Updated 4 months ago
- ☆139 Updated 2 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆237 Updated 4 months ago
- Code for training & evaluating Contextual Document Embedding models ☆93 Updated this week
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆194 Updated 6 months ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆191 Updated 3 weeks ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆251 Updated last year
- What would you do with 1000 H100s... ☆895 Updated 10 months ago
- ☆223 Updated 4 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆92 Updated last week
- ☆133 Updated 9 months ago
- A bibliography and survey of the papers surrounding o1 ☆643 Updated this week
- NanoGPT (124M) quality in 7.8 8xH100-minutes ☆965 Updated this week
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆219 Updated last week
- An Open Source Toolkit For LLM Distillation ☆352 Updated last month
- Textbook on reinforcement learning from human feedback ☆74 Updated last week
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆172 Updated 3 months ago
- The Tensor (or Array) ☆408 Updated 3 months ago
- Puzzles for exploring transformers ☆323 Updated last year