KernelTuner/kernel_tuner

Kernel Tuner

☆408

Alternatives and similar repositories for kernel_tuner

Users that are interested in kernel_tuner are comparing it to the libraries listed below.

KernelTuner / kernel_launcher
Using C++ magic to capture CUDA kernels and tune them with Kernel Tuner
☆22 · Sep 12, 2025 · Updated 10 months ago
NTNU-HPC-Lab / BAT
A GPU benchmark suite for autotuners
☆19 · Feb 20, 2024 · Updated 2 years ago
CNugteren / CLTune
CLTune: An automatic OpenCL & CUDA kernel tuner
☆185 · Dec 12, 2022 · Updated 3 years ago
HiPerCoRe / KTT
Kernel Tuning Toolkit
☆71 · Jun 19, 2026 · Updated last month
KernelTuner / kernel_tuner_tutorial
A hands-on introduction to tuning GPU kernels using Kernel Tuner https://github.com/KernelTuner/kernel_tuner/
☆37 · Oct 29, 2025 · Updated 8 months ago
KernelTuner / kernel_float
CUDA/HIP header-only library for low-precision (16 bit, 8 bit) and vectorized GPU kernel development
☆24 · Jun 18, 2026 · Updated last month
ROCm / roc-stdpar
☆20 · Jan 17, 2024 · Updated 2 years ago
NVIDIA / nvbench
CUDA Kernel Benchmarking Library
☆905 · Updated this week
eyalroz / cuda-kat
CUDA kernel author's tools
☆116 · Apr 24, 2022 · Updated 4 years ago
gptune / GPTune
☆83 · Jun 25, 2026 · Updated 3 weeks ago
NVIDIA / cuCollections
☆655 · Updated this week
hpcgarage / cuASR
cuASR: CUDA Algebra for Semirings
☆49 · Aug 22, 2022 · Updated 3 years ago
NVIDIA / nvbench_demo
Simple starter CMake project that uses NVBench.
☆15 · May 6, 2025 · Updated last year
argonne-lcf / AIaccelerators-SC23-tutorial
AI Accelerators-SC23-tutorial Repository
☆12 · Nov 12, 2023 · Updated 2 years ago
seb-v / fp32_sgemm_amd
Super fast FP32 matrix multiplication on RDNA3
☆92 · Mar 30, 2025 · Updated last year
gunrock / loops
🎃 GPU load-balancing library for regular and irregular computations.
☆67 · Jun 25, 2026 · Updated 3 weeks ago
harrism / ranger
Generate simple index ranges in C++ and CUDA C++
☆39 · Jun 14, 2023 · Updated 3 years ago
Jokeren / GPA
GPU Performance Advisor
☆66 · Jul 25, 2022 · Updated 3 years ago
ORNL / HeCBench
☆300 · Updated this week
NervanaSystems / maxas
Assembler for NVIDIA Maxwell architecture
☆1,073 · Jan 3, 2023 · Updated 3 years ago
vetter / shoc
The SHOC Benchmark Suite
☆262 · Oct 6, 2025 · Updated 9 months ago
RRZE-HPC / gpu-benches
collection of benchmarks to measure basic GPU capabilities
☆530 · Oct 24, 2025 · Updated 9 months ago
siboehm / SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
☆1,259 · Sep 2, 2025 · Updated 10 months ago
md2z34 / winograd_gpu
GPU implementation of Winograd convolution
☆10 · Oct 23, 2017 · Updated 8 years ago
ekondis / mixbench
A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
☆463 · Jul 12, 2026 · Updated last week
NVIDIA / cutlass
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,119 · Updated this week
wangzyon / NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
☆486 · Mar 30, 2022 · Updated 4 years ago
vortexgpgpu / NVPTX-SPIRV-Translator
The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator.
☆45 · Oct 25, 2021 · Updated 4 years ago
OSU-STARLAB / UVM_benchmark
☆34 · Sep 9, 2020 · Updated 5 years ago
intel / intel-application-migration-tool-for-openacc-to-openmp
OpenACC* to OpenMP* API assisting migration tool
☆41 · Dec 15, 2025 · Updated 7 months ago
NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
☆1,840 · Oct 9, 2023 · Updated 2 years ago
yzhaiustc / Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
☆420 · Jan 2, 2025 · Updated last year
ROCm / rocprof-compute-viewer
☆62 · Jul 16, 2026 · Updated last week
nlesc-recruit / PowerSensor3
PowerSensor is a low-cost, custom-built device that measures the instantaneous power consumption of GPUs and other devices at a high time…
☆11 · Updated this week
Jokeren / Awesome-GPU
Awesome resources for GPUs
☆635 · Mar 10, 2026 · Updated 4 months ago
chai-benchmarks / chai
Chai
☆49 · Nov 14, 2025 · Updated 8 months ago
HAWAIILAB / cuda-flux
CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels
☆33 · Mar 15, 2021 · Updated 5 years ago
HazyResearch / ThunderKittens
Tile primitives for speedy kernels
☆3,561 · Jul 13, 2026 · Updated last week
boringlee24 / socc22-miso
MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant GPU Clusters
☆21 · Apr 21, 2023 · Updated 3 years ago