ksopyla / CudaDotProd

Several implementations of sparse matrix multiplication. All matrices are in CSR format. The code contains several CUDA kernels for two operations: the product of a sparse matrix with a dense vector, and the product of a sparse matrix with another sparse matrix.
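For readers unfamiliar with CSR, the sparse matrix × dense vector product mentioned above can be sketched on the host as follows. This is a minimal CPU reference under stated assumptions, not code from the repository: the `CsrMatrix` struct and the `csr_spmv` function are hypothetical names chosen for illustration. A simple CUDA kernel would typically assign the outer loop's rows to threads, but the repository's actual kernels may be organized differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CSR container (illustrative only, not the repository's type).
// row_ptr has rows + 1 entries; the non-zeros of row r are stored at
// positions row_ptr[r] .. row_ptr[r + 1] - 1 of col_idx and values.
struct CsrMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;
    std::vector<std::size_t> col_idx;
    std::vector<float> values;
};

// Computes y = A * x for a CSR matrix A and a dense vector x.
std::vector<float> csr_spmv(const CsrMatrix& a, const std::vector<float>& x) {
    std::vector<float> y(a.rows, 0.0f);
    for (std::size_t r = 0; r < a.rows; ++r) {
        float sum = 0.0f;
        // Walk only the stored (non-zero) entries of row r.
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[r] = sum;
    }
    return y;
}
```

Each row's dot product is independent of the others, which is what makes the operation a natural fit for GPU parallelization.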

☆16

Alternatives and similar repositories for CudaDotProd

Users that are interested in CudaDotProd are comparing it to the libraries listed below

cuMF / cumf_sgd
CUDA Matrix Factorization Library with Stochastic Gradient Descent (SGD)
☆71 · Updated 7 years ago
linnanwang / BLASX
A heterogeneous multi-GPU level-3 BLAS library
☆45 · Updated 5 years ago
danghvu / cudaSpmv
CUDA Sparse-Matrix Vector Multiplication, using Sliced Coordinate format
☆22 · Updated 7 years ago
IntelLabs / SpMP
Sparse matrix pre-processing library
☆83 · Updated last year
dmlc / nnvm-fusion
Kernel Fusion and Runtime Compilation Based on NNVM
☆70 · Updated 8 years ago
flame / fmm-gen
Generating Families of Practical Fast Matrix Multiplication Algorithms
☆12 · Updated 8 years ago
attractivechaos / matmul
Benchmarking matrix multiplication implementations
☆100 · Updated 8 years ago
matex-org / matex
Machine Learning Toolkit for Extreme Scale (MaTEx)
☆110 · Updated 6 years ago
NVIDIA / cnmem
A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory
☆297 · Updated 6 years ago
cuMF / culda_cgs
Efficient LDA solution on GPUs.
☆24 · Updated 6 years ago
eBay / maxDNN
High Efficiency Convolution Kernel for Maxwell GPU Architecture
☆134 · Updated 8 years ago
cuihenggang / geeps
GPU-specialized parameter server for GPU machine learning.
☆101 · Updated 7 years ago
hyln9 / GCNGEMM
Optimized half precision gemm assembly kernels (deprecated due to ROCm)
☆47 · Updated 8 years ago
tqchen / mshadow
Matrix Shadow: Lightweight CPU/GPU Matrix and Tensor Template Library in C++/CUDA for (Deep) Machine Learning
☆33 · Updated 8 years ago
RUSH-LAB / Flash
LSH-GPU ANN package
☆94 · Updated 6 years ago
moskewcz / boda
Boda: A C++ Framework for Efficient Experiments in Computer Vision
☆64 · Updated 5 years ago
cuMF / cumf_als
CUDA Matrix Factorization Library with Alternating Least Squares (ALS)
☆179 · Updated 6 years ago
springer13 / hptt
High-Performance Tensor Transpose library
☆200 · Updated 2 years ago
hclhkbu / dlbench
Benchmarking State-of-the-Art Deep Learning Software Tools
☆169 · Updated 7 years ago
dumerrill / merge-spmv
☆93 · Updated 8 years ago
EBD-CREST / nsparse
Sparse matrix computation library for GPU
☆56 · Updated 5 years ago
ShadenSmith / splatt
The Surprisingly ParalleL spArse Tensor Toolkit.
☆71 · Updated 3 years ago
krocki / ArrayLSTM
GPU/CPU (CUDA) Implementation of "Recurrent Memory Array Structures", Simple RNN, LSTM, Array LSTM.
☆25 · Updated 5 years ago
DBAIWangGroup / SRS
SRS - Fast Approximate Nearest Neighbor Search in High Dimensional Euclidean Space With a Tiny Index
☆55 · Updated 10 years ago
narayanan2004 / GraphMat
GraphMat graph analytics framework
☆102 · Updated 2 years ago
bryancatanzaro / kmeans
kmeans
☆54 · Updated 9 years ago
openai / openai-gemm
Open single and half precision gemm implementations
☆381 · Updated 2 years ago
Xtra-Computing / Medusa
Medusa: Building GPU-based Parallel Sparse Graph Applications with Sequential C/C++ Code
☆62 · Updated 4 years ago
deeplearningais / CUV
Matrix library for CUDA in C++ and Python
☆196 · Updated 8 years ago
naibaf7 / libdnn
Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL
☆136 · Updated 8 years ago