gunrock / loops
GPU load-balancing library for regular and irregular computations.
☆62 Updated last year
Alternatives and similar repositories for loops
Users who are interested in loops are comparing it to the libraries listed below.
- ☆45 Updated 4 years ago
- development repository for the open earth compiler ☆80 Updated 4 years ago
- ☆93 Updated 8 years ago
- Implementation and analysis of five different GPU based SPMV algorithms in CUDA ☆41 Updated 6 years ago
- CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels ☆32 Updated 4 years ago
- ☆18 Updated 5 years ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆91 Updated last week
- An extension library of WMMA API (Tensor Core API) ☆99 Updated last year
- Advanced Profiling and Analytics for AMD Hardware ☆159 Updated this week
- NUMA-aware multi-CPU multi-GPU data transfer benchmarks ☆23 Updated last year
- GPU Performance Advisor ☆65 Updated 2 years ago
- ❤️ CUDA/C++ GPU graph analytics simplified. ☆31 Updated 2 years ago
- ☆51 Updated 6 years ago
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite ☆66 Updated 6 years ago
- Benchmark for measuring the performance of sparse and irregular memory access. ☆78 Updated 2 months ago
- Efficient SpGEMM on GPU using CUDA and CSR ☆56 Updated last year
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA ☆33 Updated 4 years ago
- ☆102 Updated last year
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018 ☆72 Updated 4 years ago
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆106 Updated last month
- ☆247 Updated last month
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆133 Updated last year
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆106 Updated 7 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆81 Updated 5 years ago
- CSR-based SpGEMM on nVidia and AMD GPUs ☆46 Updated 9 years ago
- Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git) ☆126 Updated 2 years ago
- ☆260 Updated last month
- SYCL Benchmark Suite ☆65 Updated 3 weeks ago
- ☆148 Updated this week
- A hierarchical collective communications library with portable optimizations ☆35 Updated 7 months ago