Lightning-Universe / lightning-Hivemind
Lightning Training strategy for HiveMind
☆18 · Updated last week
Alternatives and similar repositories for lightning-Hivemind
Users interested in lightning-Hivemind are comparing it to the libraries listed below.
- Example of applying CUDA graphs to LLaMA-v2 ☆12 · Updated 2 years ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆66 · Updated this week
- Experiment of using Tangent to autodiff triton ☆81 · Updated last year
- A block-oriented training approach for inference-time optimization. ☆34 · Updated last year
- Memory Optimizations for Deep Learning (ICML 2023) ☆114 · Updated last year
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆46 · Updated last year
- Ship correct and fast LLM kernels to PyTorch ☆130 · Updated this week
- Work in progress. ☆77 · Updated last month
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs ☆110 · Updated 2 years ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆228 · Updated this week
- ☆115 · Updated last year
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆148 · Updated 2 years ago
- ☆71 · Updated 9 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆162 · Updated 3 weeks ago
- This repository contains the experimental PyTorch native float8 training UX ☆227 · Updated last year
- train with kittens! ☆63 · Updated last year
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS … ☆60 · Updated last year
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆16 · Updated last year
- ☆38 · Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆94 · Updated this week
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆146 · Updated last year
- QuIP quantization ☆61 · Updated last year
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆218 · Updated this week
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks. ☆64 · Updated 11 months ago
- Boosting 4-bit inference kernels with 2:4 sparsity ☆90 · Updated last year
- PB-LLM: Partially Binarized Large Language Models ☆157 · Updated 2 years ago
- Make triton easier ☆50 · Updated last year
- ☆160 · Updated 2 years ago
- Repository for CPU Kernel Generation for LLM Inference ☆27 · Updated 2 years ago
- Sparsity support for PyTorch ☆38 · Updated 9 months ago