Lightning-AI / litData
Transform datasets at scale. Optimize datasets for fast AI model training.
☆472 · Updated this week
Alternatives and similar repositories for litData
Users who are interested in litData are comparing it to the libraries listed below.
- Scalable and Performant Data Loading ☆258 · Updated this week
- Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and … ☆1,342 · Updated this week
- For optimization algorithm research and development. ☆513 · Updated this week
- TensorDict is a PyTorch-dedicated tensor container. ☆925 · Updated this week
- Helpful tools and examples for working with flex-attention ☆766 · Updated last week
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac… ☆229 · Updated 3 months ago
- Universal Tensor Operations in Einstein-Inspired Notation for Python. ☆369 · Updated last month
- Best practices & guides on how to write distributed PyTorch training code ☆418 · Updated 2 months ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆536 · Updated this week
- PyTorch per-step fault tolerance (actively under development) ☆300 · Updated this week
- PyTorch native quantization and sparsity for training and inference ☆2,041 · Updated this week
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch ☆512 · Updated 6 months ago
- ☆302 · Updated 10 months ago
- A repository for research on medium-sized language models. ☆495 · Updated last week
- Common Python utilities and GitHub Actions in Lightning Ecosystem ☆56 · Updated this week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆245 · Updated this week
- PyTorch video decoding ☆545 · Updated this week
- Annotated version of the Mamba paper ☆483 · Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆569 · Updated this week
- A PyTorch quantization backend for optimum ☆935 · Updated 3 weeks ago
- Muon optimizer: +>30% sample efficiency with <3% wallclock overhead ☆623 · Updated last month
- Pipeline Parallelism for PyTorch ☆765 · Updated 8 months ago
- A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries. ☆1,200 · Updated this week
- Website for hosting the Open Foundation Models Cheat Sheet. ☆267 · Updated last week
- PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments. ☆790 · Updated 3 months ago
- TorchX is a universal job launcher for PyTorch applications. TorchX is designed to have fast iteration time for training/research and sup… ☆362 · Updated this week
- ☆347 · Updated last week
- Scalable data preprocessing and curation toolkit for LLMs ☆910 · Updated this week
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆157 · Updated last year
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆871 · Updated 2 weeks ago