Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.
☆1,342 · Mar 19, 2026 · Updated last month
Alternatives and similar repositories for autokernel
Users that are interested in autokernel are comparing it to the libraries listed below.
- LLM4Kernel: A Survey of Large Language Models for GPU Kernel Development ☆64 · Mar 31, 2026 · Updated last month
- A CUDA kernel optimization toolkit for validation, benchmarking, Nsight Compute profiling, bottleneck analysis, and iterative tuning. It … ☆146 · Apr 22, 2026 · Updated 2 weeks ago
- Triton kernels for Flux ☆23 · Jul 7, 2025 · Updated 9 months ago
- ☆45 · Nov 1, 2025 · Updated 6 months ago
- ☆97 · Mar 21, 2026 · Updated last month
- 👷 Build compute kernels ☆214 · Apr 6, 2026 · Updated last month
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 · Apr 10, 2025 · Updated last year
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset. ☆18 · Apr 22, 2025 · Updated last year
- FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels ☆163 · Apr 26, 2026 · Updated last week
- infinite coding agent ☆79 · Updated this week
- This is the official implementation for the paper "Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-… ☆36 · Mar 30, 2026 · Updated last month
- Low overhead tracing library and trace visualizer for pipelined CUDA kernels ☆137 · Nov 26, 2025 · Updated 5 months ago
- A flask-app that helps me write blogpost. ☆13 · Mar 15, 2025 · Updated last year
- KV Cache & LoRA for minGPT ☆62 · Mar 4, 2026 · Updated 2 months ago
- A PyTorch-native inference engine with cache, parallelism, quantization for Diffusion Transformers. ☆1,156 · Apr 29, 2026 · Updated last week
- KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs) ☆971 · Mar 24, 2026 · Updated last month
- Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative model… ☆95 · Apr 3, 2026 · Updated last month
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference ☆122 · Mar 6, 2024 · Updated 2 years ago
- Making code editing up to 7.7x faster using multi-layer speculation ☆23 · Feb 20, 2025 · Updated last year
- Benchmark tests supporting the TiledCUDA library. ☆18 · Nov 19, 2024 · Updated last year
- Triton for OpenCL backend, and use mlir-translate to get source OpenCL code ☆27 · Aug 27, 2025 · Updated 8 months ago
- RAPIDS Deployment Documentation ☆15 · Apr 17, 2026 · Updated 2 weeks ago
- implement GPT-OSS 20B & 120B C++ inference from scratch on AMD GPUs ☆172 · Oct 25, 2025 · Updated 6 months ago
- Fast low-bit matmul kernels in Triton ☆446 · Apr 27, 2026 · Updated last week
- Official Repo of CudaForge ☆78 · Dec 2, 2025 · Updated 5 months ago
- The power-law compressed phase-aware asymmetric (PLCPA-ASYM) loss ☆14 · Sep 4, 2023 · Updated 2 years ago
- A hackable library for running and fine-tuning modern transformer models on commodity and alternative GPUs, powered by tinygrad. ☆29 · Feb 10, 2026 · Updated 2 months ago
- ☆49 · Mar 3, 2026 · Updated 2 months ago
- ☆41 · Updated this week
- ☆19 · Feb 25, 2026 · Updated 2 months ago
- Mirage Persistent Kernel: Compiling LLMs into a MegaKernel ☆2,234 · Apr 30, 2026 · Updated last week
- Fork of https://github.com/elastic/supply-chain-monitor with local AI backend (vLLM/llama.cpp) ☆61 · Apr 2, 2026 · Updated last month
- A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-perfo… ☆122 · Apr 15, 2026 · Updated 3 weeks ago
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning ☆438 · Mar 30, 2026 · Updated last month
- High-Performance FP32 GEMM on CUDA devices ☆122 · Jan 21, 2025 · Updated last year
- ☆42 · Dec 15, 2022 · Updated 3 years ago
- Paging Debug tool for GDB using python ☆13 · Jun 4, 2022 · Updated 3 years ago
- Shor's algorithm simulation using CUDA ☆19 · Nov 10, 2019 · Updated 6 years ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆60 · Oct 18, 2025 · Updated 6 months ago