meta-pytorch / torchft
Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)
☆420 Updated this week
Alternatives and similar repositories for torchft
Users who are interested in torchft are comparing it to the libraries listed below.
- PyTorch Single Controller ☆438 Updated last week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆270 Updated 2 months ago
- Scalable and Performant Data Loading ☆308 Updated last week
- Load compute kernels from the Hub ☆304 Updated this week
- This repository contains the experimental PyTorch native float8 training UX ☆223 Updated last year
- Fast low-bit matmul kernels in Triton ☆381 Updated 3 weeks ago
- LLM KV cache compression made easy ☆660 Updated last week
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆578 Updated 2 months ago
- A Quirky Assortment of CuTe Kernels ☆627 Updated last week
- Applied AI experiments and examples for PyTorch ☆299 Updated 2 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆215 Updated this week
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate. ☆389 Updated this week
- ☆174 Updated last year
- A library to analyze PyTorch traces. ☆416 Updated last week
- kernels, of the mega variety ☆586 Updated 3 weeks ago
- ring-attention experiments ☆154 Updated last year
- ☆316 Updated last year
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI. ☆145 Updated 2 years ago
- ☆222 Updated 3 weeks ago
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆283 Updated this week
- ☆218 Updated 9 months ago
- ☆240 Updated this week
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference. ☆296 Updated 2 months ago
- Cataloging released Triton kernels. ☆263 Updated last month
- 👷 Build compute kernels ☆163 Updated this week
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆226 Updated this week
- Efficient LLM Inference over Long Sequences ☆390 Updated 3 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆233 Updated 5 months ago
- Perplexity GPU Kernels ☆497 Updated last month
- Where GPUs get cooked 👩‍🍳🔥 ☆293 Updated last month