gpu-mode/kernelbot

gpu-mode / kernelbot

Write a fast kernel and see how you compare against the best humans and AI on gpumode.com

☆103

Alternatives and similar repositories for kernelbot

Users who are interested in kernelbot are comparing it to the repositories listed below.

gpu-mode / kernelboard
kernelboard is the webapp for https://www.gpumode.com
☆17 · Updated this week
gpu-mode / popcorn-cli
☆171 · Jul 6, 2026 · Updated 2 weeks ago
gpu-mode / popcorn
☆25 · Apr 4, 2026 · Updated 3 months ago
IST-DASLab / llmq
Quantized LLM training in pure CUDA/C++.
☆250 · Updated this week
luongthecong123 / fp8-quant-matmul
Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.
☆19 · Feb 9, 2026 · Updated 5 months ago
gpu-mode / pygpubench
GPU kernel benchmarking
☆47 · Jun 10, 2026 · Updated last month
RadeonFlow / RadeonFlow_Kernels
Efficient implementation of DeepSeek Ops (Blockwise FP8 GEMM, MoE, and MLA) for AMD Instinct MI300X
☆79 · Feb 11, 2026 · Updated 5 months ago
gpu-mode / reference-kernels
Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!
☆290 · Jul 14, 2026 · Updated last week
meta-pytorch / BackendBench
Ship correct and fast LLM kernels to PyTorch
☆151 · Jan 14, 2026 · Updated 6 months ago
ademeure / cuda-side-boost
☆60 · Feb 24, 2026 · Updated 4 months ago
hao-ai-lab / flash-attention-fp4
NVFP4 Flash-Attention 4 on Blackwell
☆28 · Updated this week
seb-v / amd_challenge_solutions
☆19 · Jun 6, 2025 · Updated last year
gau-nernst / learn-cuda
Learn CUDA with PyTorch
☆352 · Jun 1, 2026 · Updated last month
gau-nernst / gn-kernels
☆34 · Jun 28, 2026 · Updated 3 weeks ago
facebookexperimental / triton
GitHub mirror of the triton-lang/triton repo.
☆178 · Updated this week
alexzhang13 / Triton-Puzzles-Solutions
Personal solutions to the Triton Puzzles
☆22 · Jul 18, 2024 · Updated 2 years ago
gpu-mode / ring-attention
ring-attention experiments
☆171 · Oct 17, 2024 · Updated last year
dropbox / gemlite
Fast low-bit matmul kernels in Triton
☆477 · Updated this week
meta-pytorch / FACTO
Framework for Algorithmic Correctness Testing of Operators
☆16 · Mar 9, 2026 · Updated 4 months ago
UmerHA / triton_util
Make Triton easier
☆49 · Jun 12, 2024 · Updated 2 years ago
IaroslavElistratov / triton-autodiff
☆19 · Nov 11, 2025 · Updated 8 months ago
ScalingIntelligence / KernelBench
KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)
☆1,148 · Mar 24, 2026 · Updated 3 months ago
meta-pytorch / tritonbench
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
☆361 · Updated this week
vipulSharma18 / NCCL-From-First-Principles
NCCL communication API layer, and transport layer created from first principles.
☆16 · Aug 20, 2025 · Updated 11 months ago
gau-nernst / quantized-training
Explore training for quantized models
☆26 · Jul 12, 2025 · Updated last year
charlesfrye / cuda-substrings
Because it's there.
☆16 · Sep 22, 2024 · Updated last year
eligotts / legos
☆24 · Jan 22, 2026 · Updated 5 months ago
huggingface / hf-rocm-kernels
☆24 · May 26, 2026 · Updated last month
IST-DASLab / gemm-fp8
High Performance FP8 GEMM Kernels for SM89 and later GPUs.
☆21 · Jan 24, 2025 · Updated last year
HazyResearch / ThunderKittens
Tile primitives for speedy kernels
☆3,552 · Jul 13, 2026 · Updated last week
gpu-mode / triton-index
Cataloging released Triton kernels.
☆311 · Sep 9, 2025 · Updated 10 months ago
Snektron / gpumode-amd-fp8-mm
My submission for the GPUMODE/AMD fp8 mm challenge
☆29 · Jun 4, 2025 · Updated last year
WilliamZhang20 / ECE298A-TPU
A custom AI chip to be taped out soon!
☆49 · Dec 20, 2025 · Updated 7 months ago
wafer-ai / chipbenchmark
a platform for monitoring the chip situation
☆16 · Jul 19, 2025 · Updated last year
IntelLabs / EquiTriton
EquiTriton is a project that seeks to implement high-performance kernels for commonly used building blocks in equivariant neural networks…
☆74 · May 25, 2026 · Updated last month
tokenbender / infinite
a rubric driven prioritized replay rl algo to maximise continual learning
☆16 · Oct 12, 2025 · Updated 9 months ago
rkinas / triton-resources
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆495 · Mar 10, 2025 · Updated last year
HazyResearch / HipKittens
Fast and Furious AMD Kernels
☆444 · Jul 10, 2026 · Updated last week
Maharshi-Pandya / gpu-stuff
Repository for GPU related kernels for learning/testing purposes
☆19 · May 27, 2026 · Updated last month