huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for education purposes
☆987 · Updated last month
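As a rough illustration of the "4D parallelism" named in the description above, the sketch below shows how such a layout is commonly expressed with PyTorch's generic DeviceMesh API: the GPUs are factored into one mesh axis per parallelism dimension (data, tensor, pipeline, context). This is an assumption about the usual convention, not picotron's own code, and the degrees used are hypothetical.

```python
# Minimal 4D device-mesh sketch (generic PyTorch DeviceMesh, NOT picotron's API).
# Launch with e.g.: torchrun --nproc_per_node=16 mesh_sketch.py
import os

import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
dist.init_process_group(backend="nccl")

# Hypothetical parallelism degrees; their product must equal the world size (2*2*2*2 = 16).
dp, tp, pp, cp = 2, 2, 2, 2
mesh = init_device_mesh(
    "cuda", (dp, tp, pp, cp), mesh_dim_names=("dp", "tp", "pp", "cp")
)

# Each named sub-mesh yields the process group used for that dimension's
# collectives (e.g. data-parallel gradient all-reduce, tensor-parallel
# all-gather, pipeline-parallel point-to-point sends).
dp_group = mesh["dp"].get_group()
tp_group = mesh["tp"].get_group()

dist.destroy_process_group()
```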
Alternatives and similar repositories for picotron:
Users interested in picotron are comparing it to the libraries listed below.
- Minimalistic large language model 3D-parallelism training ☆1,793 · Updated this week
- Recipes to scale inference-time compute of open models ☆1,055 · Updated last month
- Best practices & guides on how to write distributed pytorch training code ☆391 · Updated last month
- A bibliography and survey of the papers surrounding o1 ☆1,185 · Updated 5 months ago
- Puzzles for learning Triton ☆1,577 · Updated 5 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,414 · Updated this week
- FlashInfer: Kernel Library for LLM Serving ☆2,659 · Updated last week
- What would you do with 1000 H100s... ☆1,035 · Updated last year
- Building blocks for foundation models. ☆483 · Updated last year
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,062 · Updated 2 months ago
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆784 · Updated last month
- An ML Systems Onboarding list ☆751 · Updated 2 months ago
- A throughput-oriented high-performance serving framework for LLMs ☆794 · Updated 7 months ago
- Helpful tools and examples for working with flex-attention ☆720 · Updated last week
- Muon is Scalable for LLM Training ☆1,022 · Updated 3 weeks ago
- LLM KV cache compression made easy ☆458 · Updated this week
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆530 · Updated this week
- Flash Attention in ~100 lines of CUDA (forward pass only) ☆779 · Updated 3 months ago
- Scalable toolkit for efficient model alignment ☆765 · Updated last week
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆510 · Updated 5 months ago
- GPU programming related news and material links ☆1,454 · Updated 3 months ago
- Tile primitives for speedy kernels ☆2,259 · Updated this week
- Understanding R1-Zero-Like Training: A Critical Perspective ☆863 · Updated this week
- Muon optimizer: >30% sample efficiency with <3% wallclock overhead ☆575 · Updated 3 weeks ago
- 🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" ☆631 · Updated last month
- UNet diffusion model in pure CUDA ☆602 · Updated 9 months ago
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM ☆1,237 · Updated this week
- Ring attention implementation with flash attention ☆737 · Updated last week
- 🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton ☆2,287 · Updated this week
- ☆423 · Updated 9 months ago