Kernel Tuning Toolkit
☆70 Apr 29, 2026 Updated this week
Alternatives and similar repositories for KTT
Users that are interested in KTT are comparing it to the libraries listed below.
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆185 Dec 12, 2022 Updated 3 years ago
- Kernel Tuner ☆393 Apr 24, 2026 Updated last week
- CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels ☆33 Mar 15, 2021 Updated 5 years ago
- A GPU performance prediction toolkit for CUDA programs ☆19 Mar 25, 2019 Updated 7 years ago
- High performance C++ Linear Algebra Library ☆16 Oct 12, 2020 Updated 5 years ago
- Work space for golang.org/x/perf version 2 ☆20 Nov 14, 2020 Updated 5 years ago
- Mirror of my Go Kyber implementation. ☆16 May 30, 2018 Updated 7 years ago
- A GPU benchmark suite for autotuners ☆19 Feb 20, 2024 Updated 2 years ago
- ☆14 Mar 1, 2025 Updated last year
- Slides and exercises for persistent memory programming tutorial ☆14 Nov 14, 2022 Updated 3 years ago
- Public proposals, extensions, information and materials from the SYCL working group ☆15 Jan 26, 2024 Updated 2 years ago
- Orio is an open-source extensible framework for the definition of domain-specific languages and generation of optimized code for multiple… ☆37 Dec 13, 2025 Updated 4 months ago
- Library with JIT (Just-in-time) compilation support to optimize performance of small and medium matrix multiplication ☆14 Apr 27, 2021 Updated 5 years ago
- Julia wrapper of CLBlast, a "tuned OpenCL BLAS library". ☆14 Aug 23, 2023 Updated 2 years ago
- Prototype for a SPIR-V assembler and disassembler. It provides a composable Java interface for generating SPIR-V code at runtime. ☆14 Oct 31, 2025 Updated 6 months ago
- ☆34 Nov 16, 2022 Updated 3 years ago
- LaTeX template for NSFC (National Natural Science Foundation of China) grant proposals. ☆23 Jan 11, 2026 Updated 3 months ago
- JOCLBlast - Java bindings for CLBlast ☆15 Mar 14, 2021 Updated 5 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆259 Jan 13, 2025 Updated last year
- ☆17 Dec 8, 2023 Updated 2 years ago
- ☆10 Jan 21, 2021 Updated 5 years ago
- An easy way to run, test, benchmark and tune OpenCL kernel files ☆24 Aug 25, 2023 Updated 2 years ago
- A general cubic equation solver and quartic equation minimisation solver written for CPU and Nvidia GPUs, for more details and results, s… ☆10 Jun 15, 2020 Updated 5 years ago
- ☆24 Jan 25, 2023 Updated 3 years ago
- Research compiler based on algorithmic skeletons ☆23 Oct 18, 2014 Updated 11 years ago
- ☆17 Feb 14, 2024 Updated 2 years ago
- A pseudo random number generator library written against the SYCL API. ☆11 Jun 11, 2019 Updated 6 years ago
- A formally-verified provably-safe sandboxing Wasm-to-native compiler ☆30 Aug 30, 2022 Updated 3 years ago
- I-D that describes the algorithm identifiers for NIST's PQC ML-DSA for use in the Internet X.509 Public Key Infrastructure ☆14 Oct 30, 2025 Updated 6 months ago
- A simple utility to create user-specified git commit hashes ☆15 Nov 24, 2025 Updated 5 months ago
- The SHOC Benchmark Suite ☆259 Oct 6, 2025 Updated 6 months ago
- Combinatorial and Geometric modeling with Generic N-dimensional Maps ☆49 Jun 14, 2019 Updated 6 years ago
- Solution to harden TLS security by storing private keys and delegating operations to the Trusted Execution Environment ☆13 Oct 10, 2022 Updated 3 years ago
- GPU Static Modeling using PTX and Deep Structured Learning ☆18 Apr 1, 2020 Updated 6 years ago
- a software library containing BLAS functions written in OpenCL ☆864 Aug 2, 2024 Updated last year
- Simple anomaly detection for univariate time series data. ☆11 Jan 8, 2021 Updated 5 years ago
- ☆12 Oct 19, 2014 Updated 11 years ago
- tools to create performance and roofline plots from measured data ☆61 Jun 10, 2014 Updated 11 years ago
- Device for ANARI generating USD+Omniverse output ☆19 Apr 14, 2026 Updated 2 weeks ago