jax-ml / scaling-book

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

☆440

Alternatives and similar repositories for scaling-book

Users who are interested in scaling-book are comparing it to the libraries listed below.

jax-ml / jax-llm-examples
☆137 Updated last week
google-deepmind / nanodo
☆274 Updated last year
rwitten / HighPerfLLMs2024
☆516 Updated last year
stanford-crfm / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
☆627 Updated this week
marin-community / marin
☆336 Updated this week
pytorch-labs / monarch
PyTorch Single Controller
☆341 Updated this week
siboehm / ShallowSpeed
Small scale distributed training of sequential deep learning models, built on Numpy and MPI.
☆137 Updated last year
srush / Autodiff-Puzzles
☆443 Updated 9 months ago
MatX-inc / seqax
seqax = sequence modeling + JAX
☆165 Updated last week
modula-systems / modula
🧱 Modula software package
☆210 Updated this week
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆188 Updated 2 months ago
huggingface / picotron_tutorial
☆203 Updated 5 months ago
NVIDIA / JAX-Toolbox
JAX-Toolbox
☆327 Updated this week
jax-ml / jax-triton
jax-triton contains integrations between JAX and OpenAI Triton
☆411 Updated last month
mlcommons / algorithmic-efficiency
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement…
☆389 Updated this week
BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆563 Updated 2 weeks ago
HazyResearch / aisys-building-blocks
Building blocks for foundation models.
☆519 Updated last year
kvfrans / jax-diffusion-transformer
Implementation of Diffusion Transformer (DiT) in JAX
☆280 Updated last year
Quentin-Anthony / nanoMPI
Simple MPI implementation for prototyping or learning
☆269 Updated last week
facebookresearch / optimizers
For optimization algorithm research and development.
☆525 Updated this week
google / tunix
A JAX-native LLM Post-Training Library
☆76 Updated this week
young-geng / scalax
A simple library for scaling up JAX programs
☆140 Updated 9 months ago
srush / Transformer-Puzzles
Puzzles for exploring transformers
☆355 Updated 2 years ago
stanford-crfm / haliax
Named Tensors for Legible Deep Learning in JAX
☆194 Updated this week
pytorch / torchft
Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)
☆366 Updated last week
LambdaLabsML / distributed-training-guide
Best practices & guides on how to write distributed pytorch training code
☆460 Updated 5 months ago
KellerJordan / cifar10-airbench
CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds
☆274 Updated 2 weeks ago
srush / annotated-mamba
Annotated version of the Mamba paper
☆487 Updated last year
gpu-mode / profiling-cuda-in-torch
☆162 Updated last year
google / orbax
Orbax provides common checkpointing and persistence utilities for JAX users
☆410 Updated this week