HabanaAI / Gaudi-tutorials
Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://developer.habana.ai/
☆55 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for Gaudi-tutorials
- Qualcomm Cloud AI SDK (Platform and Apps) enable high performance deep learning inference on Qualcomm Cloud AI platforms delivering high … ☆55 Updated last month
- Large Language Model Text Generation Inference on Habana Gaudi ☆27 Updated last week
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆13 Updated last month
- Reference models for Intel(R) Gaudi(R) AI Accelerator ☆155 Updated this week
- Collection of kernels written in Triton language ☆69 Updated 3 weeks ago
- SynapseAI Core is a reference implementation of the SynapseAI API running on Habana Gaudi ☆37 Updated last year
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference ☆112 Updated 8 months ago
- Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) ☆153 Updated this week
- Provides the examples to write and build Habana custom kernels using the HabanaTools ☆18 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆43 Updated this week
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆83 Updated 3 months ago
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆35 Updated 6 months ago
- Applied AI experiments and examples for PyTorch ☆168 Updated 3 weeks ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆11 Updated last month
- [ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache ☆243 Updated last month
- Flexible simulator for mixed precision and format simulation of LLMs and vision transformers. ☆43 Updated last year
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆87 Updated last month
- Setup and Installation Instructions for Habana binaries, docker image creation ☆23 Updated last month
- Fast Hadamard transform in CUDA, with a PyTorch interface ☆111 Updated 6 months ago
- An experimentation platform for LLM inference optimisation ☆26 Updated 2 months ago
- Nsight Systems in Docker ☆17 Updated 11 months ago
- ☆13 Updated this week
- Extensible collectives library in Triton ☆72 Updated 2 months ago
- Code repo for the paper "SpinQuant: LLM quantization with learned rotations" ☆164 Updated 2 weeks ago
- Cataloging released Triton kernels. ☆138 Updated 2 months ago
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆53 Updated 8 months ago
- Penn CIS 5650 (GPU Programming and Architecture) Final Project ☆25 Updated 11 months ago
- ☆22 Updated this week
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆38 Updated 3 weeks ago