cwpearson / cupti
Profile how CUDA applications create and modify data in memory.
☆14Updated 7 years ago
Alternatives and similar repositories for cupti
Users who are interested in cupti are comparing it to the repositories listed below
- A tool for examining GPU scheduling behavior.☆92Updated last year
- CUDAAdvisor: a GPU profiling tool☆52Updated 7 years ago
- tools to create performance and roofline plots from measured data☆60Updated 11 years ago
- CERE: Codelet Extractor and REplayer☆40Updated 2 years ago
- Enabling on-the-fly manipulations with LLVM IR code of CUDA sources☆125Updated 9 months ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs☆31Updated last year
- CUPTI GPU Profiler☆40Updated 6 years ago
- A fast and accurate reuse distance analyzer for multi-threaded applications. It leverages existing hardware features in commodity CPUs.☆21Updated 3 years ago
- ☆68Updated 6 years ago
- ☆54Updated 6 years ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels.☆138Updated 2 years ago
- ☆304Updated this week
- Race detector for NVIDIA GPUs, published in SOSP 2021.☆18Updated 11 months ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU☆85Updated 6 years ago
- MLIR-based partitioning system☆164Updated this week
- Haystack is an analytical cache model that, given a program, computes the number of cache misses.☆46Updated 6 years ago
- Artifact Evaluation Reproduction for "Software Prefetching for Indirect Memory Accesses", CGO 2017, using CK.☆43Updated 4 years ago
- Conversions to MLIR EmitC☆134Updated last year
- HPC Challenge Benchmark☆68Updated 4 months ago
- Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs☆92Updated last year
- development repository for the open earth compiler☆82Updated 4 years ago
- An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluations☆19Updated 5 years ago
- ☆40Updated 3 years ago
- assembler for NVIDIA FERMI. Imported from Google Code☆75Updated 10 years ago
- Instantiate the Cache Aware Roofline Model on single socket and multisocket systems.☆27Updated 6 years ago
- GPU Performance Advisor☆65Updated 3 years ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.☆147Updated last week
- A sandbox for quick iteration and experimentation on projects related to IREE, MLIR, and LLVM☆62Updated 10 months ago
- A framework that helps implement swizzle GPU kernels☆51Updated 5 years ago
- npcomp - An aspirational MLIR based numpy compiler☆51Updated 5 years ago