Lightning-AI / lightning-Habana
Lightning support for Intel Habana accelerators.
☆25 · Updated 2 weeks ago
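Typical usage, following the project's README, is to hand an HPU accelerator to a standard Lightning `Trainer`. The sketch below is illustrative only: it assumes `lightning` and `lightning-habana` are installed and a Gaudi device is visible, and the toy module and data are invented for the example.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from lightning import LightningModule, Trainer
from lightning_habana.pytorch.accelerator import HPUAccelerator  # import path per the README

class TinyModel(LightningModule):
    """Toy regression module, only here to exercise the HPU path."""
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

data = DataLoader(TensorDataset(torch.randn(64, 32), torch.randn(64, 1)), batch_size=16)
# devices=1 requests a single Gaudi HPU; the package also ships multi-HPU strategies.
trainer = Trainer(accelerator=HPUAccelerator(), devices=1, max_epochs=1)
trainer.fit(TinyModel(), data)
```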
Related projects:
- Implementation of Infini-Transformer in Pytorch (☆100, updated last month)
- NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference (☆58, updated this week)
- Elixir: Train a Large Language Model on a Small GPU Cluster (☆12, updated last year)
- CUDA implementation of autoregressive linear attention, with all the latest research findings (☆43, updated last year)
- DAM: Data Acquisition for ML Benchmark, as part of the DataPerf benchmark suite, https://dataperf.org/ (☆22, updated last year)
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" (☆57, updated 5 months ago)
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch (☆35, updated 2 years ago)
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024) (☆28, updated 4 months ago)
- **ARCHIVED** Filesystem interface to 🤗 Hub (☆56, updated last year)
- Some common Huggingface transformers in maximal update parametrization (µP), sketched briefly after this list (☆76, updated 2 years ago)
- Awesome Triton Resources (☆16, updated 3 weeks ago)
- Randomized Positional Encodings Boost Length Generalization of Transformers (☆78, updated 6 months ago)
- Large-scale distributed model training strategy with Colossal-AI and Lightning AI (☆58, updated last year)
- GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU (☆17, updated 3 weeks ago)
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" (☆56, updated 10 months ago)
- Experiment of using Tangent to autodiff Triton (☆66, updated 7 months ago)
- Official repository of "Sparse ISO-FLOP Transformations for Maximizing Training Efficiency" (☆23, updated last month)
- JAX/Flax implementation of the Hyena Hierarchy (☆29, updated last year)
- Revisiting Efficient Training Algorithms for Transformer-based Language Models (NeurIPS 2023) (☆77, updated last year)
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts (☆101, updated last year)
- Simple and efficient pytorch-native transformer training and inference (batched) (☆53, updated 5 months ago)
- Prototype routines for GPU quantization written using PyTorch (☆19, updated 6 months ago)
- Triton Implementation of HyperAttention Algorithm (☆46, updated 9 months ago)
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… (☆115, updated last month)
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8 (☆34, updated 2 months ago)
- Utilities for Training Very Large Models (☆56, updated last week)
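The µP entry above refers to maximal update parametrization: parametrize a network so that hyperparameters tuned at small width transfer to large width. A rough sketch of the idea using Microsoft's `mup` package (not the listed repo's own code; the toy MLP is invented, and a faithful setup would also re-initialize weights with `mup`'s init helpers, omitted here):

```python
import torch
from torch import nn
from mup import MuReadout, MuAdam, set_base_shapes  # pip install mup

class MLP(nn.Module):
    """Toy model: under muP, the output layer becomes a MuReadout."""
    def __init__(self, width):
        super().__init__()
        self.body = nn.Linear(32, width)
        self.head = MuReadout(width, 10)  # width-aware readout scaling

    def forward(self, x):
        return self.head(torch.relu(self.body(x)))

# Base and delta models tell mup which dimensions scale with width.
base, delta, model = MLP(width=64), MLP(width=128), MLP(width=1024)
set_base_shapes(model, base, delta=delta)
# MuAdam rescales per-parameter learning rates so the same lr works at any width.
opt = MuAdam(model.parameters(), lr=1e-3)
```

With this setup, a learning rate tuned on the narrow model can be reused on the wide one, which is the point of the µP repository listed above.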