meta-pytorch/KernelAgent

Autonomous GPU Kernel Generation & Optimization via Deep Agents

☆486

Alternatives and similar repositories for KernelAgent

Users interested in KernelAgent are comparing it to the libraries listed below.

ScalingIntelligence / KernelBench
KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)
☆1,148 · Mar 24, 2026 · Updated 3 months ago
flashinfer-ai / flashinfer-bench
Building the Virtuous Cycle for AI-driven LLM Systems
☆259 · May 1, 2026 · Updated 2 months ago
meta-pytorch / tritonbench
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
☆361 · Updated this week
NVIDIA / SOL-ExecBench
A benchmark of real-world DL kernel problems
☆257 · Updated this week
AMD-AGI / GEAK
Generating Efficient AI-Centric Kernels
☆121 · Updated this week
caoshiyi / K-Search
Automated High-Performance GPU Kernel Generation
☆120 · Jun 1, 2026 · Updated last month
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,494 · Updated this week
pytorch / helion
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
☆910 · Updated this week
mirage-project / mirage
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
☆2,376 · Updated this week
mit-han-lab / kernel-design-agents
☆754 · Jun 2, 2026 · Updated last month
flashinfer-ai / mlsys26-agent-baseline
☆33 · Mar 12, 2026 · Updated 4 months ago
meta-pytorch / BackendBench
Ship correct and fast LLM kernels to PyTorch
☆151 · Jan 14, 2026 · Updated 6 months ago
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆5,988 · Updated this week
TongmingLAIC / AKO4ALL
Agentic Kernel Optimization for All: automated GPU kernel optimization for any kernel, any hardware, any language
☆323 · May 31, 2026 · Updated last month
NVIDIA / CompileIQ
An Optimizer for Nvidia Compilers.
☆107 · Jul 3, 2026 · Updated 2 weeks ago
BytedTsinghua-SIA / CUDA-Agent
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
☆1,114 · Jul 8, 2026 · Updated last week
flagos-ai / awesome-LLM-driven-kernel-generation
Review automated kernel generation in the era of LLMs
☆273 · Jun 25, 2026 · Updated 3 weeks ago
thunlp / TritonBench
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
☆137 · Jun 14, 2025 · Updated last year
hkust-nlp / KernelGYM
[KernelGYM & Dr. Kernel] A distributed GPU environment and a collection of RL training methods to support RL for Kernel Generations [ICML…
☆193 · Mar 29, 2026 · Updated 3 months ago
RightNow-AI / autokernel
Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.
☆1,469 · Mar 19, 2026 · Updated 4 months ago
tile-ai / tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
☆6,674 · Updated this week
apache / tvm-ffi
Open ABI and FFI for Machine Learning Systems
☆434 · Updated this week
kcxain / Awesome-LLM4Kernel
LLM4Kernel: A Survey of Large Language Models for GPU Kernel Development
☆76 · Mar 31, 2026 · Updated 3 months ago
tile-ai / TileRT
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
☆1,573 · Jul 14, 2026 · Updated last week
Dao-AILab / quack
A Quirky Assortment of CuTe Kernels
☆1,063 · Updated this week
mit-han-lab / KernelWiki
☆310 · Jun 9, 2026 · Updated last month
Dao-AILab / sonic-moe
Accelerating MoE with IO and Tile-aware Optimizations
☆732 · Jul 4, 2026 · Updated 2 weeks ago
flashinfer-ai / flashinfer-bench-starter-kit
FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels
☆175 · Apr 26, 2026 · Updated 2 months ago
RLsys-Foundation / TritonForge
🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation…
☆146 · Nov 10, 2025 · Updated 8 months ago
OptimAI-Lab / CudaForge
Official Repo of CudaForge
☆84 · Dec 2, 2025 · Updated 7 months ago
flashinfer-ai / cutlass-viz
☆65 · Apr 26, 2025 · Updated last year
mit-han-lab / ncu-report-skill
☆156 · May 24, 2026 · Updated last month
facebookexperimental / triton
GitHub mirror of the triton-lang/triton repo.
☆178 · Updated this week
flashinfer-ai / cubloaty
A size profiler for CUDA binaries
☆71 · Jan 15, 2026 · Updated 6 months ago
ROCm / FlyDSL
FlyDSL is the Python front‑end of the project: Flexible LaYout DSL.
☆237 · Updated this week
Dogacel / auto-gpu-kernel
Winner 🏆 (Agent-only) MLSys 2026 - FlashInfer AI Kernel Generation Contest for the DeepSeek Sparse Attention (DSA) track with an average…
☆148 · Jun 10, 2026 · Updated last month
flagos-ai / FlagGems
FlagGems is an operator library for large language models implemented in the Triton Language.
☆1,053 · Updated this week
NVIDIA / compute-eval
Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Lar…
☆143 · May 19, 2026 · Updated 2 months ago
uccl-project / mKernel
mKernel: fast multi-node, multi-GPU fused kernels
☆251 · Jun 21, 2026 · Updated 3 weeks ago