pytorch / torchft
PyTorch per-step fault tolerance (actively under development)
☆266 Updated this week
Alternatives and similar repositories for torchft:
Users who are interested in torchft compare it to the libraries listed below:
- Scalable and Performant Data Loading ☆230 Updated this week
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆524 Updated last month
- This repository contains the experimental PyTorch native float8 training UX ☆222 Updated 7 months ago
- ☆203 Updated last month
- Applied AI experiments and examples for PyTorch ☆249 Updated this week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆232 Updated 2 weeks ago
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems ☆234 Updated this week
- Fast low-bit matmul kernels in Triton ☆267 Updated this week
- Efficient LLM Inference over Long Sequences ☆365 Updated last month
- LLM KV cache compression made easy ☆440 Updated this week
- Cataloging released Triton kernels. ☆204 Updated 2 months ago
- ring-attention experiments ☆127 Updated 5 months ago
- Collection of kernels written in Triton language ☆114 Updated last month
- A tool to configure, launch and manage your machine learning experiments. ☆129 Updated this week
- ☆191 Updated this week
- ☆158 Updated last month
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆102 Updated this week
- Transform datasets at scale. Optimize datasets for fast AI model training. ☆426 Updated this week
- Helpful tools and examples for working with flex-attention ☆695 Updated this week
- TorchFix - a linter for PyTorch-using code with autofix support ☆136 Updated last month
- ☆290 Updated this week
- ☆151 Updated last year
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs ☆232 Updated 3 weeks ago
- extensible collectives library in triton ☆84 Updated 5 months ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆255 Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆262 Updated 5 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆155 Updated 3 months ago