flagos-ai/awesome-LLM-driven-kernel-generation

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/flagos-ai/awesome-LLM-driven-kernel-generation)

flagos-ai / awesome-LLM-driven-kernel-generation

Review automated kernel generation in the era of LLMs

☆276

Alternatives and similar repositories for awesome-LLM-driven-kernel-generation

Users that are interested in awesome-LLM-driven-kernel-generation are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

kcxain / Awesome-LLM4Kernel
View on GitHub
LLM4Kernel: A Survey of Large Language Models for GPU Kernel Development
☆76Mar 31, 2026Updated 3 months ago
flagos-ai / FlagTree
View on GitHub
FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang…
☆305Updated this week
ScalingIntelligence / KernelBench
View on GitHub
KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)
☆1,163Mar 24, 2026Updated 4 months ago
flagos-ai / FlagCX
View on GitHub
FlagCX is a scalable and adaptive cross-chip communication library.
☆222Updated this week
hkust-nlp / KernelGYM
View on GitHub
[KernelGYM & Dr. Kernel] A distributed GPU environment and a collection of RL training methods to support RL for Kernel Generations [ICML…
☆196Mar 29, 2026Updated 4 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
flagos-ai / FlagScale
View on GitHub
FlagScale is a large model toolkit based on open-sourced projects.
☆529Updated this week
meta-pytorch / KernelAgent
View on GitHub
Autonomous GPU Kernel Generation & Optimization via Deep Agents
☆491Jul 15, 2026Updated 2 weeks ago
mit-han-lab / KernelWiki
View on GitHub
☆317Jun 9, 2026Updated last month
flagos-ai / FlagGems
View on GitHub
FlagGems is an operator library for large language models implemented in the Triton Language.
☆1,059Updated this week
flagos-ai / KernelGen
View on GitHub
Next-Generation AI-Assisted Kernel Engineering for Multi-Chip Systems
☆68Jul 18, 2026Updated last week
TongmingLAIC / AKO4ALL
View on GitHub
Agentic Kernel Optimization for All — automated GPU kernel optimization for any kernel, any hardware, any language
☆332May 31, 2026Updated last month
flashinfer-ai / flashinfer-bench
View on GitHub
Building the Virtuous Cycle for AI-driven LLM Systems
☆261May 1, 2026Updated 2 months ago
NVIDIA / SOL-ExecBench
View on GitHub
A benchmark of real-world DL kernel problems
☆265Jul 15, 2026Updated 2 weeks ago
wzzll123 / MultiKernelBench
View on GitHub
MultiArchKernelBench: A Multi-Platform Benchmark for Kernel Generation
☆66Jul 8, 2026Updated 3 weeks ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
NVlabs / KernelBlaster
View on GitHub
A framework for in context learning for code optimization
☆60Mar 14, 2026Updated 4 months ago
mit-han-lab / kernel-design-agents
View on GitHub
☆780Jun 2, 2026Updated last month
0satan0 / KernelMem
View on GitHub
☆23Feb 14, 2026Updated 5 months ago
caoshiyi / K-Search
View on GitHub
Automated High-Performance GPU Kernel Generation
☆120Jun 1, 2026Updated last month
BytedTsinghua-SIA / CUDA-Agent
View on GitHub
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
☆1,122Jul 8, 2026Updated 3 weeks ago
KernelFlow-ops / cuda-optimized-skill
View on GitHub
A CUDA kernel optimization toolkit for validation, benchmarking, Nsight Compute profiling, bottleneck analysis, and iterative tuning. It …
☆193Apr 22, 2026Updated 3 months ago
RightNow-AI / autokernel
View on GitHub
Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.
☆1,488Mar 19, 2026Updated 4 months ago
ChandlerGuan / kperfir_artifact
View on GitHub
☆19May 9, 2025Updated last year
Infatoshi / KernelBench-v3
View on GitHub
☆21Jun 12, 2026Updated last month
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Dogacel / auto-gpu-kernel
View on GitHub
Winner 🏆 (Agent-only) MLSys 2026 - FlashInfer AI Kernel Generation Contest for the DeepSeek Sparse Attention (DSA) track with an average…
☆148Jun 10, 2026Updated last month
NVlabs / SOLAR
View on GitHub
Speed of Light Analysis for ML Model Runtime
☆108Jun 10, 2026Updated last month
ByteDance-Seed / cudaLLM
View on GitHub
☆149Aug 18, 2025Updated 11 months ago
thunlp / TritonBench
View on GitHub
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
☆138Jun 14, 2025Updated last year
RLsys-Foundation / TritonForge
View on GitHub
🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation…
☆146Nov 10, 2025Updated 8 months ago
SakanaAI / robust-kbench
View on GitHub
☆101Nov 22, 2025Updated 8 months ago
mit-han-lab / ncu-report-skill
View on GitHub
☆160May 24, 2026Updated 2 months ago
AMD-AGI / GEAK
View on GitHub
Generating Efficient AI-Centric Kernels
☆139Updated this week
flashinfer-ai / mlsys26-agent-baseline
View on GitHub
☆33Mar 12, 2026Updated 4 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
kcxain / Awesome-Kernel-Skills
View on GitHub
Public skills collected from well-known open-source projects focused on LLM infrastructure, GPU kernels, compiler/operator development
☆27May 7, 2026Updated 2 months ago
mit-han-lab / mlsys2026-flashinfer-contest
View on GitHub
☆109Jun 13, 2026Updated last month
technillogue / ptx-isa-markdown
View on GitHub
PTX ISA 9.1 documentation converted to searchable markdown. Includes Claude Code skill for CUDA development.
☆220Dec 24, 2025Updated 7 months ago
TongmingLAIC / AKO4X
View on GitHub
Agentic Kernel Optimization — advanced & eXtensible: a closed-loop, campaign-based multi-agent system for optimizing GPU kernels (benchma…
☆61May 31, 2026Updated last month
ROCm / tritonBLAS
View on GitHub
A lightweight triton-based General Matrix Multiplication (GEMM) library.
☆66Jul 21, 2026Updated last week
flashinfer-ai / flashinfer-bench-starter-kit
View on GitHub
FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels
☆178Apr 26, 2026Updated 3 months ago
NVIDIA / compute-eval
View on GitHub
Evaluating Large Language Models for CUDA Code Generation ComputeEval is a framework designed to generate and evaluate CUDA code from Lar…
☆143May 19, 2026Updated 2 months ago