karthikeyann / cuda-calculator
HTML/JS port of CUDA Occupancy Calculator
☆17 · Updated 3 years ago
Alternatives and similar repositories for cuda-calculator:
Users who are interested in cuda-calculator are comparing it to the repositories listed below:
- Online CUDA Occupancy Calculator ☆75 · Updated 3 years ago
- GPU load-balancing library for regular and irregular computations. ☆62 · Updated 9 months ago
- ☆19 · Updated 5 years ago
- A GPU algorithm for sparse matrix-matrix multiplication ☆70 · Updated 4 years ago
- Benchmark for measuring the performance of sparse and irregular memory access. ☆76 · Updated last month
- Instructions and templates for SC authors ☆16 · Updated 3 years ago
- A library to benchmark CUDA code, similar to google benchmark. ☆28 · Updated 3 years ago
- cuASR: CUDA Algebra for Semirings ☆35 · Updated 2 years ago
- Error-Free Transformations as building blocks for compensated algorithms ☆14 · Updated 2 years ago
- CUDA Dynamic Memory Allocator for SOA Data Layout ☆35 · Updated 3 years ago
- A unified framework across multiple programming platforms ☆36 · Updated 9 months ago
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 · Updated 3 years ago
- A thin wrapper around miOpen and cuDNN ☆41 · Updated last year
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆105 · Updated 7 years ago
- development repository for the open earth compiler ☆79 · Updated 4 years ago
- GPU Code optimizer for stencil computations. Refer to our IPDPS'19 paper for more details ☆24 · Updated 5 years ago
- Next generation library for iterative sparse solvers for ROCm platform ☆78 · Updated this week
- sparse matrix pre-processing library ☆81 · Updated 10 months ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆46 · Updated 10 years ago
- ☆91 · Updated 8 years ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆68 · Updated 2 years ago
- This tool serves as a test harness for different optimization techniques to improve stencil computations performance in shared and distri… ☆20 · Updated 2 years ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆66 · Updated this week
- CSR-based SpGEMM on nVidia and AMD GPUs ☆45 · Updated 8 years ago
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆36 · Updated 6 months ago
- Efficient SpGEMM on GPU using CUDA and CSR ☆52 · Updated last year
- Chai ☆43 · Updated last year
- A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037 ☆41 · Updated last year
- ☆14 · Updated 4 years ago
- A dynamic analysis tool to detect floating-point errors in HPC applications. ☆33 · Updated this week