pytorch / torchft
PyTorch per-step fault tolerance (actively under development)
★ 273 · Updated this week
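For orientation, here is a minimal sketch of the per-step fault-tolerant training loop torchft targets, assuming the `Manager`, `Optimizer`, and `DistributedDataParallel` wrappers shown in the project's README. torchft is under active development, so these names and signatures may have changed; treat this as an illustration of the pattern, not the current API.

```python
# Minimal sketch of per-step fault-tolerant training with torchft.
# Assumption: the Manager / Optimizer / DistributedDataParallel wrappers
# from torchft's README; exact names may differ in current releases.
import torch
import torch.nn as nn
import torch.optim as optim
from torchft import DistributedDataParallel, Manager, Optimizer, ProcessGroupGloo

model = nn.Linear(2, 3)

# Callbacks the manager uses to snapshot and restore training state
# when a replica fails or a new one joins mid-run.
def state_dict():
    return {"model": model.state_dict()}

def load_state_dict(sd):
    model.load_state_dict(sd["model"])

manager = Manager(
    pg=ProcessGroupGloo(),  # fault-tolerant process group for gradient averaging
    load_state_dict=load_state_dict,
    state_dict=state_dict,
)

model = DistributedDataParallel(manager, model)
optimizer = Optimizer(manager, optim.AdamW(model.parameters()))

for _ in range(1000):
    batch = torch.rand(2, 2)
    # The wrapped optimizer scopes each iteration as a managed step:
    # if a replica drops out, only that step's work is lost.
    optimizer.zero_grad()
    loss = model(batch).sum()
    loss.backward()
    optimizer.step()
```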
Alternatives and similar repositories for torchft:
Users interested in torchft are comparing it to the libraries listed below.
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ★ 238 · Updated this week
- Scalable and Performant Data Loading ★ 234 · Updated this week
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ★ 156 · Updated 3 weeks ago
- LLM KV cache compression made easy ★ 452 · Updated 3 weeks ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ★ 528 · Updated last month
- This repository contains the experimental PyTorch native float8 training UX ★ 222 · Updated 8 months ago
- ★ 166 · Updated last month
- Applied AI experiments and examples for PyTorch ★ 256 · Updated 3 weeks ago
- Fast low-bit matmul kernels in Triton ★ 285 · Updated this week
- Efficient LLM Inference over Long Sequences ★ 366 · Updated last month
- Cataloging released Triton kernels. ★ 213 · Updated 3 months ago
- ★ 205 · Updated 2 months ago
- Load compute kernels from the Hub ★ 113 · Updated this week
- Perplexity GPU Kernels ★ 185 · Updated this week
- Google TPU optimizations for transformers models ★ 107 · Updated 2 months ago
- Where GPUs get cooked ★ 221 · Updated last month
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ★ 170 · Updated last week
- Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ★ 190 · Updated last week
- Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs ★ 237 · Updated last week
- ★ 153 · Updated last year
- ★ 198 · Updated this week
- Collection of kernels written in the Triton language (see the Triton sketch after this list) ★ 118 · Updated last week
- Extensible collectives library in Triton ★ 84 · Updated last week
- Small-scale distributed training of sequential deep learning models, built on NumPy and MPI. ★ 128 · Updated last year
- ★ 295 · Updated this week
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems ★ 254 · Updated last week
- ★ 185 · Updated this week
- JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ★ 313 · Updated this week
- A library to analyze PyTorch traces. ★ 361 · Updated last week
- ring-attention experiments ★ 129 · Updated 5 months ago
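Several of the entries above (the Triton-based neural network modules, the kernel catalog, the low-bit matmul kernels, and the collectives library) are built on OpenAI's Triton language. For readers unfamiliar with it, below is a generic, minimal Triton vector-add kernel illustrating the programming model those repositories build on; it is not taken from any of the listed projects.

```python
# Generic Triton vector-add kernel (essentially the canonical tutorial example).
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the tensors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
assert torch.allclose(add(x, y), x + y)
```

The masked load/store pattern is what lets a single kernel handle tensor sizes that are not multiples of the block size; the kernel collections listed above apply the same structure to fused, quantized, and collective operations.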