Lightning-AI / lightning-thunder
Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
☆1,257 (updated this week)
Alternatives and similar repositories for lightning-thunder:
Users who are interested in lightning-thunder are comparing it to the libraries listed below.
- PyTorch native quantization and sparsity for training and inference ☆1,753 (updated this week)
- A PyTorch native library for large model training ☆3,091 (updated this week)
- Puzzles for learning Triton ☆1,300 (updated last month)
- Transform datasets at scale. Optimize datasets for fast AI model training. ☆396 (updated this week)
- Minimalistic large language model 3D-parallelism training ☆1,386 (updated this week)
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆505 (updated 2 months ago)
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆644 (updated this week)
- Schedule-Free Optimization in PyTorch ☆2,061 (updated last month)
- Tile primitives for speedy kernels ☆1,923 (updated this week)
- A simple, performant and scalable Jax LLM! ☆1,587 (updated this week)
- What would you do with 1000 H100s... ☆948 (updated last year)
- Training LLMs with QLoRA + FSDP ☆1,436 (updated 2 months ago)
- UNet diffusion model in pure CUDA ☆596 (updated 6 months ago)
- A modern model graph visualizer and debugger ☆1,098 (updated this week)
- NanoGPT (124M) in 3.4 minutes ☆2,068 (updated last week)
- TensorDict is a pytorch dedicated tensor container. ☆862 (updated this week)
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs… ☆2,086 (updated this week)
- A pytorch quantization backend for optimum ☆865 (updated last week)
- Official implementation of Half-Quadratic Quantization (HQQ) ☆732 (updated this week)
- For optimization algorithm research and development. ☆484 (updated this week)
- A JAX research toolkit for building, editing, and visualizing neural networks. ☆1,714 (updated last month)
- ☆913 (updated this week)
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ☆1,481 (updated 2 months ago)
- Helpful tools and examples for working with flex-attention ☆583 (updated this week)
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,147 (updated last week)
- Pipeline Parallelism for PyTorch ☆736 (updated 4 months ago)
- GPU programming related news and material links ☆1,312 (updated last week)
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆752 (updated this week)
- FlashInfer: Kernel Library for LLM Serving ☆1,797 (updated this week)