A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆470 · Mar 10, 2025 · Updated last year
Alternatives and similar repositories for triton-resources
Users that are interested in triton-resources are comparing it to the libraries listed below.
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast… ☆450 · Feb 22, 2025 · Updated last year
- Cataloging released Triton kernels. ☆302 · Sep 9, 2025 · Updated 7 months ago
- My submission for the GPUMODE/AMD fp8 mm challenge ☆29 · Jun 4, 2025 · Updated 10 months ago
- Puzzles for learning Triton ☆2,374 · Apr 1, 2026 · Updated 2 weeks ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆343 · Apr 11, 2026 · Updated last week
- GPU Kernels ☆223 · Apr 27, 2025 · Updated 11 months ago
- Triton Compiler related materials. ☆43 · Mar 16, 2026 · Updated last month
- Learnings and programs related to CUDA ☆437 · Jun 29, 2025 · Updated 9 months ago
- Learn CUDA with PyTorch ☆274 · Apr 9, 2026 · Updated last week
- 100 days of building GPU kernels! ☆593 · Apr 27, 2025 · Updated 11 months ago
- A 120-day CUDA learning plan covering daily concepts, exercises, pitfalls, and references (including “Programming Massively Parallel Proc… ☆903 · Mar 29, 2025 · Updated last year
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆201 · Jun 1, 2025 · Updated 10 months ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge. ☆19 · Feb 9, 2026 · Updated 2 months ago
- ☆32 · Jul 2, 2025 · Updated 9 months ago
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆600 · Aug 12, 2025 · Updated 8 months ago
- Fast low-bit matmul kernels in Triton ☆443 · Apr 4, 2026 · Updated 2 weeks ago
- EquiTriton is a project that seeks to implement high-performance kernels for commonly used building blocks in equivariant neural networks… ☆68 · Updated this week
- A bunch of kernels that might make stuff slower 😉 ☆87 · Updated this week
- ☆427 · Apr 10, 2025 · Updated last year
- GPU programming related news and material links ☆2,093 · Mar 8, 2026 · Updated last month
- DeeperGEMM: crazy optimized version ☆86 · May 5, 2025 · Updated 11 months ago
- Write a fast kernel and see how you compare against the best humans and AI on gpumode.com ☆91 · Updated this week
- Automatic differentiation for Triton Kernels ☆29 · Aug 12, 2025 · Updated 8 months ago
- ☆315 · Mar 31, 2026 · Updated 2 weeks ago
- Efficient Triton Kernels for LLM Training ☆6,279 · Updated this week
- Collection of kernels written in Triton language ☆188 · Jan 27, 2026 · Updated 2 months ago
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆2,146 · Aug 26, 2025 · Updated 7 months ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. ☆106 · Jun 28, 2025 · Updated 9 months ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate. ☆835 · Updated this week
- 🚀 Efficient implementations for emerging model architectures ☆4,878 · Updated this week
- Distributed Compiler based on Triton for Parallel Systems ☆1,403 · Apr 10, 2026 · Updated last week
- Build compute kernels and load them from the Hub. ☆596 · Updated this week
- making the official triton tutorials actually comprehensible ☆144 · Aug 25, 2025 · Updated 7 months ago
- coding CUDA everyday! ☆74 · Feb 5, 2026 · Updated 2 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆254 · May 6, 2025 · Updated 11 months ago
- Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels ☆5,497 · Updated this week
- PTX-Tutorial Written Purely By AIs (Deep Research of Openai and Claude 3.7) ☆66 · Mar 24, 2025 · Updated last year
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆254 · Updated this week
- learningggggggg 🐳 ☆616 · Apr 2, 2025 · Updated last year