VikParuchuri / triton_tutorial
Tutorials for Triton, a language for writing GPU kernels
☆57 · Updated 2 years ago
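For context, Triton lets GPU kernels be written in Python. Below is a minimal vector-add sketch in the style of the official Triton tutorials; the kernel and wrapper names are illustrative, not code from this repository:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous BLOCK_SIZE chunk.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the tail when n isn't a multiple of BLOCK_SIZE
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)  # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```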
Alternatives and similar repositories for triton_tutorial
Users interested in triton_tutorial are comparing it to the repositories listed below:
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆196 · Updated 6 months ago
- ☆91 · Updated last year
- Accelerated First Order Parallel Associative Scan ☆192 · Updated last year
- Experiment of using Tangent to autodiff Triton ☆80 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆174 · Updated 5 months ago
- ☆177 · Updated last year
- FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference. ☆313 · Updated last month
- Flash-Muon: An Efficient Implementation of Muon Optimizer ☆212 · Updated 5 months ago
- Understand and test language model architectures on synthetic tasks. ☆240 · Updated 2 months ago
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆85 · Updated 2 months ago
- ☆222 · Updated 11 months ago
- Custom Triton kernels for training Karpathy's nanoGPT. ☆19 · Updated last year
- A bunch of kernels that might make stuff slower 😉 ☆65 · Updated this week
- The simplest implementation of recent sparse attention patterns for efficient LLM inference. ☆91 · Updated 4 months ago
- Normalized Transformer (nGPT) ☆194 · Updated last year
- ☆38 · Updated last year
- Code for studying the super weight in LLMs ☆121 · Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆135 · Updated 11 months ago
- ☆83 · Updated 2 years ago
- Supporting PyTorch FSDP for optimizers ☆84 · Updated 11 months ago
- The evaluation framework for training-free sparse attention in LLMs ☆106 · Updated last month
- Annotated version of the Mamba paper ☆491 · Updated last year
- An extension of the nanoGPT repository for training small MoE models. ☆215 · Updated 8 months ago
- ☆224 · Updated last week
- ☆28 · Updated 2 months ago
- Load compute kernels from the Hub ☆337 · Updated last week
- ☆121 · Updated last year
- Fast and memory-efficient exact attention ☆74 · Updated 9 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆100 · Updated last year
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆243 · Updated 5 months ago