rkinas/triton-resources

A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.

☆496

Alternatives and similar repositories for triton-resources

Users who are interested in triton-resources are comparing it to the libraries listed below.

rkinas / cuda-learning
This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast…
☆462 · Feb 22, 2025 · Updated last year
gpu-mode / triton-index
Cataloging released Triton kernels.
☆310 · Sep 9, 2025 · Updated 10 months ago
Snektron / gpumode-amd-fp8-mm
My submission for the GPUMODE/AMD fp8 mm challenge
☆29 · Jun 4, 2025 · Updated last year
gpu-mode / Triton-Puzzles
Puzzles for learning Triton
☆2,545 · Apr 1, 2026 · Updated 3 months ago
meta-pytorch / tritonbench
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
☆363 · Updated this week
1y33 / 100Days
GPU Kernels
☆225 · Apr 27, 2025 · Updated last year
a-hamdi / GPU
100 days of building GPU kernels!
☆617 · Apr 27, 2025 · Updated last year
Maharshi-Pandya / cudacodes
Learnings and programs related to CUDA
☆440 · Jun 29, 2025 · Updated last year
AdepojuJeremy / CUDA-120-DAYS--CHALLENGE
A 120-day CUDA learning plan covering daily concepts, exercises, pitfalls, and references (including “Programming Massively Parallel Proc…
☆941 · Mar 29, 2025 · Updated last year
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆208 · Jun 1, 2025 · Updated last year
dropbox / gemlite
Fast low-bit matmul kernels in Triton
☆477 · Jul 15, 2026 · Updated 2 weeks ago
gpu-mode / resource-stream
GPU programming related news and material links
☆2,245 · Jun 15, 2026 · Updated last month
gau-nernst / learn-cuda
Learn CUDA with PyTorch
☆357 · Jun 1, 2026 · Updated last month
gfvvz / triton-learning-materials
Triton Compiler related materials.
☆45 · Mar 16, 2026 · Updated 4 months ago
huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for education purpose
☆2,260 · Aug 26, 2025 · Updated 11 months ago
BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆605 · May 13, 2026 · Updated 2 months ago
IntelLabs / EquiTriton
EquiTriton is a project that seeks to implement high-performance kernels for commonly used building blocks in equivariant neural networks…
☆74 · May 25, 2026 · Updated 2 months ago
pytorch / helion
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
☆912 · Updated this week
pranjalssh / fast.cu
Fastest kernels written from scratch
☆587 · Sep 18, 2025 · Updated 10 months ago
Deep-Learning-Profiling-Tools / triton-viz
☆351 · Jul 16, 2026 · Updated last week
open-lm-engine / accelerated-model-architectures
☆91 · Updated this week
hkproj / 100-days-of-gpu
☆440 · Apr 10, 2025 · Updated last year
gpu-mode / kernelbot
Write a fast kernel and see how you compare against the best humans and AI on gpumode.com
☆105 · Updated this week
linkedin / Liger-Kernel
Efficient Triton Kernels for LLM Training
☆6,537 · Updated this week
a-hamdi / native-sparse-attention
☆15 · Feb 23, 2025 · Updated last year
danielvegamyhre / ml-perf-reading-group
EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers)
☆36 · Mar 20, 2026 · Updated 4 months ago
ademeure / DeeperGEMM
DeeperGEMM: crazy optimized version
☆86 · May 5, 2025 · Updated last year
microsoft / TileFusion
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
☆115 · Jun 28, 2025 · Updated last year
SzymonOzog / Penny
Hand-Rolled GPU communications library
☆96 · Nov 25, 2025 · Updated 8 months ago
wafer-ai / gpu-perf-engineering-resources
A curriculum for learning about gpu performance engineering, from scratch to what the frontier AI labs do
☆1,273 · Apr 27, 2026 · Updated 3 months ago
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,503 · Jul 20, 2026 · Updated last week
daniel-geon-park / triton_bwd
Automatic differentiation for Triton Kernels
☆29 · Aug 12, 2025 · Updated 11 months ago
Dao-AILab / quack
A Quirky Assortment of CuTe Kernels
☆1,076 · Updated this week
fla-org / flash-linear-attention
🚀 Efficient implementations for emerging model architectures
☆5,463 · Updated this week
RadeonFlow / RadeonFlow_Kernels
Efficient implementation of DeepSeek Ops (Blockwise FP8 GEMM, MoE, and MLA) for AMD Instinct MI300X
☆79 · Feb 11, 2026 · Updated 5 months ago
cherichy / tilecute
☆32 · Jul 2, 2025 · Updated last year
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆86 · May 22, 2025 · Updated last year
zinccat / Awesome-Triton-Kernels
Collection of kernels written in Triton language
☆200 · Jan 27, 2026 · Updated 6 months ago
luongthecong123 / fp8-quant-matmul
Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.
☆19 · Feb 9, 2026 · Updated 5 months ago