Lightning-AI / litdata

Transform datasets at scale. Optimize datasets for fast AI model training.

☆438

Alternatives and similar repositories for litdata:

Users who are interested in litdata are comparing it to the libraries listed below.

facebookresearch / spdl
Scalable and Performant Data Loading
☆231 · Updated this week
pytorch / torcheval
A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac…
☆229 · Updated 2 months ago
pytorch / tensordict
TensorDict is a PyTorch-dedicated tensor container.
☆905 · Updated this week
Lightning-AI / lightning-thunder
Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and …
☆1,314 · Updated this week
facebookresearch / optimizers
For optimization algorithm research and development.
☆502 · Updated this week
fferflo / einx
Universal Tensor Operations in Einstein-Inspired Notation for Python.
☆364 · Updated last month
pytorch-labs / attention-gym
Helpful tools and examples for working with flex-attention
☆701 · Updated 2 weeks ago
BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆526 · Updated last month
pytorch / torchft
PyTorch per step fault tolerance (actively under development)
☆270 · Updated this week
pytorch-labs / torchfix
TorchFix - a linter for PyTorch-using code with autofix support
☆137 · Updated last month
huggingface / optimum-quanto
A PyTorch quantization backend for optimum
☆910 · Updated 3 weeks ago
pytorch / PiPPy
Pipeline Parallelism for PyTorch
☆761 · Updated 7 months ago
Lightning-AI / utilities
Common Python utilities and GitHub Actions in Lightning Ecosystem
☆54 · Updated this week
google / fiddle
☆345 · Updated this week
lucidrains / ring-attention-pytorch
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
☆509 · Updated 5 months ago
srush / annotated-mamba
Annotated version of the Mamba paper
☆478 · Updated last year
google / grain
Library for reading and processing ML training data.
☆414 · Updated last week
BlackHC / toma
Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory
☆435 · Updated 7 months ago
microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆857 · Updated last month
pytorch / ao
PyTorch native quantization and sparsity for training and inference
☆1,927 · Updated this week
pytorch / torchx
TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup…
☆356 · Updated this week
pytorch / torchcodec
PyTorch video decoding
☆477 · Updated this week
huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for educational purposes
☆970 · Updated 3 weeks ago
triton-inference-server / pytriton
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
☆785 · Updated last month
foundation-model-stack / fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…
☆234 · Updated this week
NVIDIA-Merlin / dataloader
The Merlin dataloader lets you rapidly load tabular data for training deep learning models with TensorFlow, PyTorch, or JAX
☆417 · Updated 11 months ago
apple / ml-sigma-reparam
☆302 · Updated 9 months ago
KellerJordan / Muon
Muon optimizer: +>30% sample efficiency with <3% wallclock overhead
☆539 · Updated last week
stanford-crfm / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX
☆562 · Updated this week
patrick-kidger / jaxtyping
Type annotations and runtime checking for shape and dtype of JAX/NumPy/PyTorch/etc. arrays. https://docs.kidger.site/jaxtyping/
☆1,356 · Updated last week