zpzim / MSplitGEMM
Large matrix multiplication in CUDA
☆15Updated last year
Alternatives and similar repositories for MSplitGEMM:
Users who are interested in MSplitGEMM are comparing it to the libraries listed below:
- CSR-based SpGEMM on nVidia and AMD GPUs☆45Updated 8 years ago
- Sparse-dense matrix-matrix multiplication on GPUs☆15Updated 6 years ago
- Sparse matrix computation library for GPU☆54Updated 4 years ago
- ☆93Updated 8 years ago
- CUDA Tensor Transpose (cuTT) library☆51Updated 7 years ago
- Implementation and analysis of five different GPU-based SpMV algorithms in CUDA☆38Updated 6 years ago
- Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA☆32Updated 4 years ago
- Efficient SpGEMM on GPU using CUDA and CSR☆50Updated last year
- Repository holding the code base to AC-SpGEMM : "Adaptive Sparse Matrix-Matrix Multiplication on the GPU"☆28Updated 4 years ago
- bhSPARSE: A Sparse BLAS Library☆16Updated 9 years ago
- Parallel Tensor Infrastructure (ParTI!)☆28Updated 4 years ago
- ☆11Updated 4 years ago
- The Surprisingly ParalleL spArse Tensor Toolkit.☆70Updated 2 years ago
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm☆198Updated 2 months ago
- CSR-based SpMV on Heterogeneous Processors (Intel Broadwell, AMD Kaveri and nVidia Tegra K1)☆26Updated 9 years ago
- This package includes the implementation for four sparse linear algebra kernels: Sparse-Matrix-Vector-Multiplication (SpMV), Sparse-Trian…☆26Updated 4 years ago
- ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)☆17Updated 5 years ago
- A C++ library for computing large scale tensor contractions.☆36Updated 6 years ago
- CSR5-based SpMV on CPUs, GPUs and Xeon Phi☆102Updated 8 months ago
- ☆37Updated 3 years ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)☆124Updated 4 years ago
- CLTune: An automatic OpenCL & CUDA kernel tuner☆173Updated 2 years ago
- Multi-GPU Computing Benchmark Suite (CUDA)☆42Updated 7 years ago
- This is a tuned sparse matrix-dense vector multiplication (SpMV) library☆21Updated 8 years ago
- Matrix Multiplication on GPU using Shared Memory considering Coalescing and Bank Conflicts☆24Updated 2 years ago
- A Sound and Complete Verification Tool for Warp-Specialized GPU Kernels☆18Updated 9 years ago
- benchmarking miopen☆17Updated 6 years ago
- Examples for HPC course☆39Updated 3 years ago
- Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs☆15Updated 5 years ago
- GPU implementation of Winograd convolution☆10Updated 7 years ago