A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆484 · Mar 10, 2025 · Updated last year
Alternatives and similar repositories for triton-resources
Users that are interested in triton-resources are comparing it to the libraries listed below.
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast… ☆451 · Feb 22, 2025 · Updated last year
- Cataloging released Triton kernels. ☆306 · Sep 9, 2025 · Updated 8 months ago
- My submission for the GPUMODE/AMD fp8 mm challenge ☆29 · Jun 4, 2025 · Updated 11 months ago
- Puzzles for learning Triton ☆2,457 · Apr 1, 2026 · Updated last month
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆355 · Updated this week
- GPU Kernels ☆224 · Apr 27, 2025 · Updated last year
- Triton Compiler related materials. ☆44 · Mar 16, 2026 · Updated 2 months ago
- Learnings and programs related to CUDA ☆437 · Jun 29, 2025 · Updated 11 months ago
- Learn CUDA with PyTorch ☆303 · May 13, 2026 · Updated 2 weeks ago
- 100 days of building GPU kernels! ☆598 · Apr 27, 2025 · Updated last year
- A 120-day CUDA learning plan covering daily concepts, exercises, pitfalls, and references (including “Programming Massively Parallel Proc… ☆915 · Mar 29, 2025 · Updated last year
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆203 · Jun 1, 2025 · Updated 11 months ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge. ☆19 · Feb 9, 2026 · Updated 3 months ago
- ☆32 · Jul 2, 2025 · Updated 10 months ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆600 · May 13, 2026 · Updated 2 weeks ago
- Fast low-bit matmul kernels in Triton ☆458 · May 15, 2026 · Updated 2 weeks ago
- A bunch of kernels that might make stuff slower 😉 ☆90 · May 20, 2026 · Updated last week
- ☆427 · Apr 10, 2025 · Updated last year
- GPU programming related news and material links ☆2,142 · Mar 8, 2026 · Updated 2 months ago
- DeeperGEMM: crazy optimized version ☆86 · May 5, 2025 · Updated last year
- Automatic differentiation for Triton Kernels ☆29 · Aug 12, 2025 · Updated 9 months ago
- Write a fast kernel and see how you compare against the best humans and AI on gpumode.com ☆96 · May 8, 2026 · Updated 3 weeks ago
- Efficient Triton Kernels for LLM Training ☆6,393 · Updated this week
- Collection of kernels written in Triton language ☆195 · Jan 27, 2026 · Updated 4 months ago
- ☆329 · May 22, 2026 · Updated last week
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆2,188 · Aug 26, 2025 · Updated 9 months ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. ☆109 · Jun 28, 2025 · Updated 11 months ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate. ☆869 · Updated this week
- 🚀 Efficient implementations for emerging model architectures ☆5,139 · Updated this week
- Distributed Compiler based on Triton for Parallel Systems ☆1,440 · Apr 22, 2026 · Updated last month
- Build compute kernels and load them from the Hub. ☆650 · May 21, 2026 · Updated last week
- coding CUDA everyday! ☆77 · Feb 5, 2026 · Updated 3 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆256 · May 6, 2025 · Updated last year
- making the official triton tutorials actually comprehensible ☆161 · May 10, 2026 · Updated 2 weeks ago
- PTX-Tutorial Written Purely By AIs (Deep Research of Openai and Claude 3.7) ☆66 · Mar 24, 2025 · Updated last year
- Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels ☆6,278 · May 21, 2026 · Updated last week
- learningggggggg 🐳 ☆619 · Apr 2, 2025 · Updated last year
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆265 · May 11, 2026 · Updated 2 weeks ago
- ☆249 · Nov 24, 2025 · Updated 6 months ago