facebookresearch / spdl
Scalable and Performant Data Loading
☆222 · Updated this week
Alternatives and similar repositories for spdl:
Users who are interested in spdl are comparing it to the libraries listed below.
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆225 · Updated this week
- PyTorch per step fault tolerance (actively under development) ☆253 · Updated last week
- This repository contains the experimental PyTorch native float8 training UX ☆221 · Updated 7 months ago
- Helpful tools and examples for working with flex-attention ☆662 · Updated last week
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆515 · Updated last week
- For optimization algorithm research and development. ☆497 · Updated this week
- Transform datasets at scale. Optimize datasets for fast AI model training. ☆417 · Updated last week
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆84 · Updated last week
- PyTorch video decoding ☆250 · Updated this week
- Efficient optimizers ☆177 · Updated last week
- A library for unit scaling in PyTorch ☆123 · Updated 3 months ago
- Implementation of a Transformer, but completely in Triton ☆259 · Updated 2 years ago
- Muon optimizer: +>30% sample efficiency with <3% wallclock overhead ☆434 · Updated this week
- Implementation of Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆503 · Updated 4 months ago
- ☆91 · Updated 9 months ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆156 · Updated 11 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆218 · Updated last month
- Google TPU optimizations for transformers models ☆100 · Updated last month
- ☆142 · Updated 2 weeks ago
- Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers ☆84 · Updated 7 months ago
- ☆301 · Updated 8 months ago
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆120 · Updated 7 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆122 · Updated 10 months ago
- supporting pytorch FSDP for optimizers ☆77 · Updated 2 months ago
- Understand and test language model architectures on synthetic tasks. ☆183 · Updated last month
- DeMo: Decoupled Momentum Optimization ☆181 · Updated 3 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆550 · Updated this week
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac… ☆227 · Updated last month
- Accelerated First Order Parallel Associative Scan ☆172 · Updated 6 months ago