Kernel Tuning Toolkit
☆70 · Jun 9, 2026 · Updated this week
Alternatives and similar repositories for KTT
Users who are interested in KTT are comparing it to the libraries listed below.
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆185 · Dec 12, 2022 · Updated 3 years ago
- Kernel Tuner ☆398 · Updated this week
- CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels ☆33 · Mar 15, 2021 · Updated 5 years ago
- Implementation of cryptographic primitives in Go ☆13 · Mar 13, 2023 · Updated 3 years ago
- A GPU performance prediction toolkit for CUDA programs ☆18 · Mar 25, 2019 · Updated 7 years ago
- High performance C++ Linear Algebra Library ☆16 · Oct 12, 2020 · Updated 5 years ago
- Work space for golang.org/x/perf version 2 ☆20 · Nov 14, 2020 · Updated 5 years ago
- Mirror of my Go Kyber implementation. ☆16 · May 30, 2018 · Updated 8 years ago
- A GPU benchmark suite for autotuners ☆19 · Feb 20, 2024 · Updated 2 years ago
- ☆14 · Mar 1, 2025 · Updated last year
- Slides and exercises for persistent memory programming tutorial ☆14 · Nov 14, 2022 · Updated 3 years ago
- TLS in Rust (eventually) ☆21 · Mar 12, 2013 · Updated 13 years ago
- Public proposals, extensions, information and materials from the SYCL working group ☆15 · Jan 26, 2024 · Updated 2 years ago
- Orio is an open-source extensible framework for the definition of domain-specific languages and generation of optimized code for multiple… ☆37 · Dec 13, 2025 · Updated 6 months ago
- Library with JIT (Just-in-time) compilation support to optimize performance of small and medium matrix multiplication ☆14 · Apr 27, 2021 · Updated 5 years ago
- Julia wrapper of CLBlast, a "tuned OpenCL BLAS library". ☆14 · Aug 23, 2023 · Updated 2 years ago
- Prototype for a SPIR-V assembler and disassembler. It provides a composable Java interface for generating SPIR-V code at runtime. ☆14 · Oct 31, 2025 · Updated 7 months ago
- ☆34 · Nov 16, 2022 · Updated 3 years ago
- JOCLBlast - Java bindings for CLBlast ☆15 · Mar 14, 2021 · Updated 5 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆259 · Jan 13, 2025 · Updated last year
- ☆17 · Dec 8, 2023 · Updated 2 years ago
- ☆10 · Jan 21, 2021 · Updated 5 years ago
- Predict Performance of GPU Applications using analytical model and Machine Learning ☆11 · Aug 31, 2022 · Updated 3 years ago
- Research compiler based on algorithmic skeletons ☆23 · Oct 18, 2014 · Updated 11 years ago
- Provides a Simple Way to Calculate ANOVAs From Fitted Linear Models. ☆21 · Jun 10, 2024 · Updated 2 years ago
- ☆17 · Feb 14, 2024 · Updated 2 years ago
- A pseudo random number generator library written against the SYCL API. ☆11 · Jun 11, 2019 · Updated 7 years ago
- Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme… ☆21 · Apr 25, 2024 · Updated 2 years ago
- I-D that describes the algorithm identifiers for NIST's PQC ML-DSA for use in the Internet X.509 Public Key Infrastructure ☆14 · Oct 30, 2025 · Updated 7 months ago
- A simple utility to create user-specified git commit hashes ☆15 · Nov 24, 2025 · Updated 6 months ago
- Gede is a graphical frontend (GUI) to GDB written in C++ and using the Qt4 toolkit. This repository is an unofficial mirror of the websi… ☆10 · Mar 26, 2020 · Updated 6 years ago
- PIRA - Automatic Instrumentation Refinement ☆17 · Mar 28, 2024 · Updated 2 years ago
- ☆10 · Jul 3, 2018 · Updated 7 years ago
- The SHOC Benchmark Suite ☆262 · Oct 6, 2025 · Updated 8 months ago
- HiCMA: Hierarchical Computations on Manycore Architectures ☆34 · Mar 19, 2023 · Updated 3 years ago
- Vulkan compute shader experiment ☆11 · Jan 13, 2021 · Updated 5 years ago
- ☆23 · Oct 26, 2019 · Updated 6 years ago
- High Speed elliptic curve signature system using a 260-bit Granger Moss Prime. ☆14 · Jun 3, 2021 · Updated 5 years ago
- parser script to process pytorch autograd profiler result, convert json file to excel. ☆15 · Oct 8, 2019 · Updated 6 years ago