huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for educational purposes
☆1,505 · Updated 2 months ago
Alternatives and similar repositories for picotron
Users who are interested in picotron are comparing it to the libraries listed below.
- Minimalistic large language model 3D-parallelism training ☆1,888 · Updated last week
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,372 · Updated last month
- NanoGPT (124M) in 3 minutes ☆2,600 · Updated this week
- Puzzles for learning Triton ☆1,658 · Updated 6 months ago
- nanoGPT style version of Llama 3.1 ☆1,373 · Updated 9 months ago
- A PyTorch native platform for training generative AI models ☆3,838 · Updated this week
- Recipes to scale inference-time compute of open models ☆1,073 · Updated last week
- What would you do with 1000 H100s... ☆1,048 · Updated last year
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆793 · Updated last month
- A bibliography and survey of the papers surrounding o1 ☆1,193 · Updated 6 months ago
- UNet diffusion model in pure CUDA ☆605 · Updated 11 months ago
- The Multilayer Perceptron Language Model ☆549 · Updated 9 months ago
- Tile primitives for speedy kernels ☆2,399 · Updated this week
- The simplest, fastest repository for training/finetuning small-sized VLMs. ☆3,003 · Updated this week
- Muon optimizer: +>30% sample efficiency with <3% wallclock overhead ☆661 · Updated this week
- GPU programming related news and material links ☆1,527 · Updated 4 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,563 · Updated last week
- Best practices & guides on how to write distributed pytorch training code ☆427 · Updated 3 months ago
- PyTorch native quantization and sparsity for training and inference ☆2,064 · Updated this week
- Muon is Scalable for LLM Training ☆1,049 · Updated 2 months ago
- The Autograd Engine ☆607 · Updated 8 months ago
- FlashInfer: Kernel Library for LLM Serving ☆3,044 · Updated this week
- Building blocks for foundation models. ☆500 · Updated last year
- An ML Systems Onboarding list ☆789 · Updated 4 months ago
- Code for BLT research paper ☆1,664 · Updated last week
- Flash Attention in ~100 lines of CUDA (forward pass only) ☆827 · Updated 5 months ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆874 · Updated last month
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM ☆1,417 · Updated this week
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆460 · Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,120 · Updated 4 months ago