meta-pytorch / monarch
PyTorch Single Controller
☆905 Updated this week
Alternatives and similar repositories for monarch
Users who are interested in monarch are comparing it to the libraries listed below.
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆455 Updated 2 weeks ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate. ☆633 Updated last week
- PyTorch-native post-training at scale ☆546 Updated last week
- Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs ☆700 Updated this week
- Where GPUs get cooked 👩🍳🔥 ☆319 Updated 2 months ago
- ☆912 Updated 3 weeks ago
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA (+ more DSLs) ☆676 Updated last week
- torchcomms: a modern PyTorch communications API ☆295 Updated this week
- A library to analyze PyTorch traces. ☆436 Updated last week
- LLM KV cache compression made easy ☆694 Updated last week
- Scalable and Performant Data Loading ☆345 Updated this week
- Dion optimizer algorithm ☆388 Updated last week
- A Quirky Assortment of CuTe Kernels ☆675 Updated last week
- Load compute kernels from the Hub ☆335 Updated this week
- 👷 Build compute kernels ☆186 Updated this week
- kernels, of the mega variety ☆614 Updated 2 months ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆583 Updated 3 months ago
- Perplexity GPU Kernels ☆531 Updated 3 weeks ago
- Simple MPI implementation for prototyping or learning ☆289 Updated 3 months ago
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference. ☆305 Updated 3 weeks ago
- ☆72 Updated 9 months ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆271 Updated this week
- Perplexity open source garden for inference technology ☆274 Updated last week
- Tilus is a tile-level kernel programming language with explicit control over shared memory and registers. ☆404 Updated this week
- ☆233 Updated 5 months ago
- ☆546 Updated last year
- ☆250 Updated last week
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP ☆138 Updated 2 months ago
- NVIDIA Inference Xfer Library (NIXL) ☆729 Updated this week
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆235 Updated last week