VikParuchuri / triton_tutorial
Tutorials for Triton, a language for writing GPU kernels
☆73 Updated 2 years ago
Alternatives and similar repositories for triton_tutorial
Users who are interested in triton_tutorial are comparing it to the libraries listed below.
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆198 Updated 8 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆186 Updated 3 weeks ago
- ☆92 Updated last year
- Load compute kernels from the Hub ☆397 Updated this week
- ☆178 Updated 2 years ago
- Experiment of using Tangent to autodiff triton ☆82 Updated 2 years ago
- Accelerated First Order Parallel Associative Scan ☆196 Updated last month
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆86 Updated 4 months ago
- ☆232 Updated 2 months ago
- ☆147 Updated this week
- Normalized Transformer (nGPT) ☆198 Updated last year
- This repository contains the experimental PyTorch native float8 training UX ☆227 Updated last year
- Understand and test language model architectures on synthetic tasks. ☆252 Updated 3 weeks ago
- Code for studying the super weight in LLM ☆121 Updated last year
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆280 Updated 2 months ago
- Annotated version of the Mamba paper ☆496 Updated last year
- ☆124 Updated last year
- Fast and memory-efficient exact attention ☆75 Updated 11 months ago
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆92 Updated 6 months ago
- supporting pytorch FSDP for optimizers ☆84 Updated last year
- Awesome Triton Resources ☆39 Updated 9 months ago
- ☆83 Updated 2 years ago
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference. ☆334 Updated 3 months ago
- EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers) ☆24 Updated last month
- The evaluation framework for training-free sparse attention in LLMs ☆117 Updated 2 weeks ago
- ☆236 Updated last year
- A bunch of kernels that might make stuff slower 😉 ☆75 Updated this week
- Custom triton kernels for training Karpathy's nanoGPT. ☆19 Updated last year
- ring-attention experiments ☆165 Updated last year
- Cataloging released Triton kernels. ☆292 Updated 5 months ago