KernelTuner/kernel_launcher

Using C++ magic to capture CUDA kernels and tune them with Kernel Tuner

☆21

Alternatives and similar repositories for kernel_launcher

Users who are interested in kernel_launcher are comparing it to the libraries listed below.

NTNU-HPC-Lab / BAT
A GPU benchmark suite for autotuners
☆19 · Feb 20, 2024 · Updated 2 years ago
pyxis-roc / ptxparser
A parser for PTX 6.5
☆13 · Jun 19, 2023 · Updated 3 years ago
ariasanovsky / ptx-parser
☆11 · Jun 9, 2023 · Updated 3 years ago
e-ago / hpgmg-cuda-async
GPUDirect Async implementation of HPGMG-FV CUDA
☆11 · May 11, 2018 · Updated 8 years ago
Zymrael / torchSODE
PyTorch block-diagonal ODE CUDA solver, designed for gradient-based optimization
☆16 · Apr 27, 2020 · Updated 6 years ago
reger-men / HPL_GPU
High-Performance Linpack Benchmark adopted version for GPU backend
☆12 · Sep 12, 2022 · Updated 3 years ago
UoB-HPC / minifmm
☆11 · Aug 8, 2021 · Updated 4 years ago
stijnh / HyGraph
High-performance graph processing on hybrid CPU-GPU platforms by using dynamic load-balancing
☆12 · Sep 15, 2016 · Updated 9 years ago
SC-Tech-Program / Author-Kit
Instructions and templates for SC authors
☆17 · Aug 22, 2021 · Updated 4 years ago
seb-v / amd_challenge_solutions
☆19 · Jun 6, 2025 · Updated last year
gty111 / SimpleUseGpgpuSim
GPGPU-SIM usage guide
☆14 · Nov 12, 2022 · Updated 3 years ago
ivanradanov / rodinia
Rodinia benchmark
☆24 · Jul 5, 2024 · Updated 2 years ago
dendisuhubdy / blaze
C++ HPC Math Library
☆48 · Dec 9, 2019 · Updated 6 years ago
baco-authors / baco
☆17 · Dec 8, 2023 · Updated 2 years ago
JohndeVostok / APE
A GPU FP32 computation method with Tensor Cores.
☆27 · Dec 8, 2025 · Updated 7 months ago
RIKEN-RCCS / hpl-ai
An HPL-AI implementation for Fugaku
☆24 · Jun 29, 2021 · Updated 5 years ago
UoB-HPC / neutral
A Monte Carlo Neutron Transport Mini-App
☆15 · Apr 15, 2019 · Updated 7 years ago
hpcgame / hpcgame-platform-0th
HPC Game Platform
☆11 · Apr 20, 2023 · Updated 3 years ago
xnd-project / cuda-benchmarks
Collection of CUDA benchmarks, with a focus on unified vs. explicit memory management.
☆21 · Oct 15, 2019 · Updated 6 years ago
getianao / ngAP
ngAP's artifact for ASPLOS'24
☆25 · Jul 29, 2025 · Updated 11 months ago
ColinKennedy / tree-sitter-objdump
Parse objdump files using tree-sitter
☆13 · Nov 22, 2023 · Updated 2 years ago
eth-cscs / Tiled-MM
Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.
☆33 · Apr 2, 2025 · Updated last year
thustorage / ccnvme
ccNVMe: crash consistent non-volatile memory express
☆14 · Aug 17, 2021 · Updated 4 years ago
gevtushenko / cuda_benchmark
A library to benchmark CUDA code, similar to google benchmark.
☆30 · Apr 18, 2021 · Updated 5 years ago
brycelelbach / mditerator
A vectorizable multi-dimensional iterator for C++ using the Coroutines TS
☆12 · Jun 5, 2022 · Updated 4 years ago
Jokeren / GPA
GPU Performance Advisor
☆66 · Jul 25, 2022 · Updated 3 years ago
gtcasl / hpc-benchmarks
Collection of full, mini, proxy, and benchmark apps.
☆11 · Feb 14, 2020 · Updated 6 years ago
UK-MAC / TeaLeaf
A mini-app to solve the heat conduction equation
☆15 · Jul 1, 2020 · Updated 6 years ago
shen203 / GPU_Microbenchmark
☆25 · Jun 24, 2022 · Updated 4 years ago
UoB-HPC / miniBUDE
A BUDE virtual-screening benchmark, in many programming models
☆31 · Oct 15, 2024 · Updated last year
rapidsai / nvgraph
☆31 · Aug 28, 2020 · Updated 5 years ago
gunrock / loops
🎃 GPU load-balancing library for regular and irregular computations.
☆67 · Jun 25, 2026 · Updated 3 weeks ago
squeek502 / micro-zigfmt
micro editor plugin that provides zig fmt integration
☆15 · Jul 1, 2022 · Updated 4 years ago
regehr / pldi22-llvm-tutorial
outline and links for PLDI 2022 tutorial
☆17 · Jun 13, 2022 · Updated 4 years ago
KireinaHoro / rocket-zynqmp
☆13 · Jan 20, 2021 · Updated 5 years ago
mooware / msvcfilt
Demangles Microsoft Visual C++ symbol names
☆15 · Oct 12, 2016 · Updated 9 years ago
ROCm / hipFile
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆23 · Updated this week
Samsung / veles.simd
Distributed machine learning platform
☆13 · Aug 20, 2015 · Updated 10 years ago
arcsysu / SYSU-ARCH
SYSU-ARCH is a LAB that focuses on the use and extending of simulators.
☆10 · Dec 19, 2022 · Updated 3 years ago