weifengliu-ssslab / bhSPARSE
bhSPARSE: A Sparse BLAS Library
☆16 Updated 8 years ago
Related projects:
- sparse matrix pre-processing library ☆81 Updated 4 months ago
- CSR-based SpGEMM on nVidia and AMD GPUs ☆45 Updated 8 years ago
- A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves (SpTRSV) ☆19 Updated 4 years ago
- This package includes the implementation for Sparse-Matrix-Vector-Multiplication (SpMV) and Sparse-Matrix-Matrix-Multiplication (SpMM) fo… ☆10 Updated 4 years ago
- CSR-based SpMV on Heterogeneous Processors (Intel Broadwell, AMD Kaveri and nVidia Tegra K1) ☆25 Updated 9 years ago
- This package includes the implementation for four sparse linear algebra kernels: Sparse-Matrix-Vector-Multiplication (SpMV), Sparse-Trian… ☆22 Updated 4 years ago
- The SparseX sparse kernel optimization library ☆39 Updated 5 years ago
- HiCMA: Hierarchical Computations on Manycore Architectures ☆27 Updated last year
- CSR5-based SpMV on CPUs, GPUs and Xeon Phi ☆93 Updated 3 months ago
- LonestarGPU: Irregular algorithms parallelized for GPUs ☆33 Updated 4 years ago
- A Sound and Complete Verification Tool for Warp-Specialized GPU Kernels ☆16 Updated 9 years ago
- Sparse matrix computation library for GPU ☆54 Updated 4 years ago
- ☆88 Updated 7 years ago
- Compute applications. ☆25 Updated 4 years ago
- This tool serves as a test harness for different optimization techniques to improve stencil computations performance in shared and distri… ☆20 Updated last year
- Prototype of OpenSHMEM for NVIDIA GPUs, developed as part of DoE Design Forward ☆21 Updated 6 years ago
- ☆11 Updated this week
- a heterogeneous multiGPU level-3 BLAS library ☆45 Updated 4 years ago
- Efficient SpGEMM on GPU using CUDA and CSR ☆50 Updated last year
- A GPU performance prediction toolkit for CUDA programs ☆16 Updated 5 years ago
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆64 Updated last month
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆44 Updated 9 years ago
- Automatically exported from code.google.com/p/patus ☆15 Updated 9 years ago
- Algebraic multigrid benchmark ☆28 Updated 2 months ago
- Next generation library for iterative sparse solvers for ROCm platform ☆74 Updated this week
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 Updated 6 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆21 Updated last week
- Intel Heterogeneous Research Compiler (iHRC) ☆25 Updated last year
- CUDA Sparse-Matrix Vector Multiplication, using Sliced Coordinate format ☆20 Updated 6 years ago
- spGPU library for sparse linear algebra on GPUs ☆9 Updated 2 years ago