Lightning-AI / lightning-Habana
Lightning support for Intel Habana accelerators.
☆25 · Updated 2 weeks ago
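Typical usage, following the project's README, is to hand an HPU accelerator to a standard Lightning `Trainer`. The sketch below is illustrative only: it assumes `lightning` and `lightning-habana` are installed and a Gaudi device is visible, and the toy module and data are invented for the example.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from lightning import LightningModule, Trainer
from lightning_habana.pytorch.accelerator import HPUAccelerator  # import path per the README

class TinyModel(LightningModule):
    """Toy regression module, only here to exercise the HPU path."""
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

data = DataLoader(TensorDataset(torch.randn(64, 32), torch.randn(64, 1)), batch_size=16)
# devices=1 requests a single Gaudi HPU; the package also ships multi-HPU strategies.
trainer = Trainer(accelerator=HPUAccelerator(), devices=1, max_epochs=1)
trainer.fit(TinyModel(), data)
```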
Related projects:
- Implementation of Infini-Transformer in Pytorch (☆100, updated last month)
- NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference (☆58, updated this week)
- Elixir: Train a Large Language Model on a Small GPU Cluster (☆12, updated last year)
- CUDA implementation of autoregressive linear attention, with all the latest research findings (☆43, updated last year)
- DAM: Data Acquisition for ML Benchmark, as part of the DataPerf benchmark suite, https://dataperf.org/ (☆22, updated last year)
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" (☆57, updated 5 months ago)
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch (☆35, updated 2 years ago)
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024) (☆28, updated 4 months ago)
- **ARCHIVED** Filesystem interface to 🤗 Hub (☆56, updated last year)
- Some common Huggingface transformers in maximal update parametrization (µP), sketched briefly after this list (☆76, updated 2 years ago)
- Awesome Triton Resources (☆16, updated 3 weeks ago)
- Randomized Positional Encodings Boost Length Generalization of Transformers (☆78, updated 6 months ago)
- Large-scale distributed model training strategy with Colossal-AI and Lightning AI (☆58, updated last year)
- GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU (☆17, updated 3 weeks ago)
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" (☆56, updated 10 months ago)
- Experiment of using Tangent to autodiff Triton (☆66, updated 7 months ago)
- Official repository of "Sparse ISO-FLOP Transformations for Maximizing Training Efficiency" (☆23, updated last month)
- JAX/Flax implementation of the Hyena Hierarchy (☆29, updated last year)
- Revisiting Efficient Training Algorithms for Transformer-based Language Models (NeurIPS 2023) (☆77, updated last year)
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts (☆101, updated last year)
- Simple and efficient pytorch-native transformer training and inference (batched) (☆53, updated 5 months ago)
- Prototype routines for GPU quantization written using PyTorch (☆19, updated 6 months ago)
- Triton Implementation of HyperAttention Algorithm (☆46, updated 9 months ago)
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… (☆115, updated last month)
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8 (☆34, updated 2 months ago)
- Utilities for Training Very Large Models (☆56, updated last week)
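The µP entry above refers to maximal update parametrization: parametrize a network so that hyperparameters tuned at small width transfer to large width. A rough sketch of the idea using Microsoft's `mup` package (not the listed repo's own code; the toy MLP is invented, and a faithful setup would also re-initialize weights with `mup`'s init helpers, omitted here):

```python
import torch
from torch import nn
from mup import MuReadout, MuAdam, set_base_shapes  # pip install mup

class MLP(nn.Module):
    """Toy model: under muP, the output layer becomes a MuReadout."""
    def __init__(self, width):
        super().__init__()
        self.body = nn.Linear(32, width)
        self.head = MuReadout(width, 10)  # width-aware readout scaling

    def forward(self, x):
        return self.head(torch.relu(self.body(x)))

# Base and delta models tell mup which dimensions scale with width.
base, delta, model = MLP(width=64), MLP(width=128), MLP(width=1024)
set_base_shapes(model, base, delta=delta)
# MuAdam rescales per-parameter learning rates so the same lr works at any width.
opt = MuAdam(model.parameters(), lr=1e-3)
```

With this setup, a learning rate tuned on the narrow model can be reused on the wide one, which is the point of the µP repository listed above.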