VikParuchuri / triton_tutorial
Tutorials for Triton, a language for writing GPU kernels
☆55 · Updated 2 years ago
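For orientation, here is a minimal Triton vector-add kernel in the spirit of these tutorials. It is an illustrative sketch only, not code taken from the repository; the function and variable names are chosen for the example.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the tail of the array
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)


def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Illustrative host-side launcher (assumed helper, not from the tutorial repo).
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```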
Alternatives and similar repositories for triton_tutorial
Users interested in triton_tutorial are comparing it to the repositories listed below.
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆193 · Updated 4 months ago
- Load compute kernels from the Hub ☆304 · Updated this week
- ☆91 · Updated last year
- ☆209 · Updated 9 months ago
- Experiment of using Tangent to autodiff triton ☆80 · Updated last year
- An extension of the nanoGPT repository for training small MOE models. ☆202 · Updated 7 months ago
- ☆83 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆166 · Updated 3 months ago
- ☆174 · Updated last year
- Code for studying the super weight in LLM ☆120 · Updated 10 months ago
- Flash-Muon: An Efficient Implementation of Muon Optimizer ☆195 · Updated 4 months ago
- Understand and test language model architectures on synthetic tasks. ☆233 · Updated 3 weeks ago
- ☆121 · Updated last year
- ☆222 · Updated 3 weeks ago
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆85 · Updated last month
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference. ☆296 · Updated 2 months ago
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆91 · Updated 3 months ago
- Annotated version of the Mamba paper ☆489 · Updated last year
- Cataloging released Triton kernels. ☆263 · Updated last month
- supporting pytorch FSDP for optimizers ☆83 · Updated 10 months ago
- A bunch of kernels that might make stuff slower 😉 ☆62 · Updated this week
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆270 · Updated 2 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆248 · Updated 8 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆99 · Updated last year
- Normalized Transformer (nGPT) ☆192 · Updated 11 months ago
- Accelerated First Order Parallel Associative Scan ☆189 · Updated last year
- Prune transformer layers ☆69 · Updated last year
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆73 · Updated 2 weeks ago
- Collection of kernels written in Triton language ☆157 · Updated 6 months ago
- Simple and efficient pytorch-native transformer training and inference (batched) ☆78 · Updated last year