A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆464 · Mar 10, 2025 · Updated 11 months ago
Alternatives and similar repositories for triton-resources
Users who are interested in triton-resources are comparing it to the libraries listed below.
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast… ☆441 · Feb 22, 2025 · Updated last year
- Cataloging released Triton kernels. ☆296 · Sep 9, 2025 · Updated 6 months ago
- My submission for the GPUMODE/AMD fp8 mm challenge ☆29 · Jun 4, 2025 · Updated 9 months ago
- Puzzles for learning Triton ☆2,324 · Nov 18, 2024 · Updated last year
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆329 · Updated this week
- GPU Kernels ☆221 · Apr 27, 2025 · Updated 10 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆201 · Jun 1, 2025 · Updated 9 months ago
- 100 days of building GPU kernels! ☆575 · Apr 27, 2025 · Updated 10 months ago
- ☆32 · Jul 2, 2025 · Updated 8 months ago
- A 120-day CUDA learning plan covering daily concepts, exercises, pitfalls, and references (including “Programming Massively Parallel Proc… ☆872 · Mar 29, 2025 · Updated 11 months ago
- Learnings and programs related to CUDA ☆434 · Jun 29, 2025 · Updated 8 months ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge. ☆17 · Feb 9, 2026 · Updated last month
- Learn CUDA with PyTorch ☆244 · Updated this week
- Triton Compiler related materials. ☆42 · Jan 4, 2025 · Updated last year
- DeeperGEMM: crazy optimized version ☆74 · May 5, 2025 · Updated 10 months ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆595 · Aug 12, 2025 · Updated 6 months ago
- GPU programming related news and material links ☆2,010 · Sep 17, 2025 · Updated 5 months ago
- Automatic differentiation for Triton Kernels ☆29 · Aug 12, 2025 · Updated 6 months ago
- A bunch of kernels that might make stuff slower 😉 ☆75 · Mar 2, 2026 · Updated last week
- Build compute kernels and load them from the Hub. ☆472 · Updated this week
- Efficient Triton Kernels for LLM Training ☆6,189 · Updated this week
- EquiTriton is a project that seeks to implement high-performance kernels for commonly used building blocks in equivariant neural networks… ☆68 · Dec 16, 2025 · Updated 2 months ago
- ☆417 · Apr 10, 2025 · Updated 10 months ago
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆215 · Updated this week
- ☆301 · Updated this week
- Fast low-bit matmul kernels in Triton ☆436 · Feb 1, 2026 · Updated last month
- Write a fast kernel and see how you compare against the best humans and AI on gpumode.com ☆78 · Updated this week
- Distributed Compiler based on Triton for Parallel Systems ☆1,380 · Feb 13, 2026 · Updated 3 weeks ago
- 🚀 Efficient implementations of state-of-the-art linear attention models ☆4,474 · Updated this week
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. ☆107 · Jun 28, 2025 · Updated 8 months ago
- Collection of kernels written in Triton language ☆181 · Jan 27, 2026 · Updated last month
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆2,104 · Aug 26, 2025 · Updated 6 months ago
- Fastest kernels written from scratch ☆550 · Sep 18, 2025 · Updated 5 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆253 · May 6, 2025 · Updated 10 months ago
- Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels ☆5,330 · Updated this week
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs) ☆836 · Updated this week
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate. ☆774 · Updated this week
- Material for gpu-mode lectures ☆5,818 · Feb 1, 2026 · Updated last month
- ☆239 · Nov 24, 2025 · Updated 3 months ago