hclhkbu/gcoospdm

Sparse-dense matrix-matrix multiplication on GPUs

☆14

Alternatives and similar repositories for gcoospdm

Users who are interested in gcoospdm are comparing it to the libraries listed below.

danghvu / cudaSpmv
CUDA Sparse-Matrix Vector Multiplication, using Sliced Coordinate format
☆22 · Jun 8, 2018 · Updated 8 years ago
oresths / tSparse
A GPU algorithm for sparse matrix-matrix multiplication
☆74 · Oct 1, 2020 · Updated 5 years ago
shamanDevel / cuMat
An expression template based linear algebra library running completely on the GPU using CUDA
☆26 · Jun 24, 2021 · Updated 5 years ago
GATECH-EIC / LLM4HWDesign_Starting_Toolkit
LLM4HWDesign Starting Toolkit
☆19 · Oct 4, 2024 · Updated last year
owensgroup / merge-spmm
Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018
☆73 · Oct 5, 2020 · Updated 5 years ago
leefige / radik
Scalable radix top-k selection on GPUs.
☆23 · Jan 27, 2025 · Updated last year
YusukeNagasaka / Batched-SpMM
New batched algorithm for sparse matrix-matrix multiplication (SpMM)
☆16 · May 7, 2019 · Updated 7 years ago
chenxuhao / caffe-escoin
Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs
☆16 · Feb 28, 2019 · Updated 7 years ago
Xilinx / hydra
☆14 · Feb 14, 2022 · Updated 4 years ago
GPUPeople / spECK
Efficient SpGEMM on GPU using CUDA and CSR
☆61 · Jul 18, 2023 · Updated 3 years ago
pigirons / spmv
This is a tuned sparse matrix dense vector multiplication(SpMV) library
☆23 · Mar 21, 2016 · Updated 10 years ago
ceruleangu / Block-Sparse-Benchmark
Benchmark for matrix multiplications between dense and block sparse (BSR) matrix in TVM, blocksparse (Gray et al.) and cuSparse.
☆23 · Aug 21, 2020 · Updated 5 years ago
Bruce-Lee-LY / cuda_back2back_hgemm
Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction.
☆13 · Nov 3, 2023 · Updated 2 years ago
intel / AMX-TMUL-Code-Samples
Code samples related to Intel(R) AMX
☆38 · Apr 8, 2024 · Updated 2 years ago
Mudit7 / CUDA-ResNet
☆13 · May 8, 2020 · Updated 6 years ago
SuperScientificSoftwareLaboratory / TileSpMV
Source code of the IPDPS '21 paper: "TileSpMV: A Tiled Algorithm for Sparse Matrix-Vector Multiplication on GPUs" by Yuyao Niu, Zhengyang…
☆13 · Aug 12, 2022 · Updated 3 years ago
quantum-compiler / atlas
The Atlas multi-GPU quantum circuit simulator.
☆15 · Aug 17, 2024 · Updated last year
robjsliwa / llama-agent
Fun project to run your own LLM chat bot using llama.cpp
☆11 · Jun 9, 2023 · Updated 3 years ago
PASSIONLab / distributed_sddmm
Distributed SDDMM Kernel
☆12 · Jul 8, 2022 · Updated 4 years ago
slongle / GPU-Renderer
Offline renderer using CUDA
☆13 · Jun 8, 2020 · Updated 6 years ago
apuaaChen / vectorSparse
☆32 · Aug 24, 2022 · Updated 3 years ago
josehu07 / cuckoo-hashing-CUDA
Parallel cuckoo hashing on GPUs with CUDA
☆12 · Sep 27, 2019 · Updated 6 years ago
marcsous / gpuSparse
Matlab mex wrappers to cuSPARSE (NVIDIA)
☆11 · Dec 10, 2025 · Updated 7 months ago
vadimkantorov / readaudio
Read audio with FFmpeg into NumPy/PyTorch via ctypes (standard library module)
☆11 · Aug 12, 2020 · Updated 5 years ago
CMU-SAFARI / SparseP
SparseP is the first open-source Sparse Matrix Vector Multiplication (SpMV) software package for real-world Processing-In-Memory (PIM) ar…
☆80 · Jun 29, 2022 · Updated 4 years ago
kisupov / mpres-blas
Multiple-precision GPU accelerated linear algebra routines (dense and sparse) based on residue number system
☆22 · Dec 19, 2022 · Updated 3 years ago
LucasWilkinson / ASpT-mirror
Mirror of http://gitlab.hpcrl.cse.ohio-state.edu/chong/ppopp19_ae, refactoring for understanding
☆17 · Oct 20, 2021 · Updated 4 years ago
eisl-nctu / falco
A 32-bit out-of-order RISC-V superscalar for Xilinx FPGAs.
☆15 · Jan 14, 2022 · Updated 4 years ago
mlcommons / mobile_open
MLPerf Mobile benchmarks
☆15 · Apr 28, 2026 · Updated 2 months ago
lz1313 / BlockCIrculantRNN
BlockCIrculantRNN (LSTM and GRU) using TensorFlow
☆14 · Oct 30, 2018 · Updated 7 years ago
fpga-design-contest / ad-refkit
autonomous driving contest reference kit
☆10 · Dec 2, 2021 · Updated 4 years ago
gchaw / wattless
GPU-accelerated AES encryption project
☆11 · Feb 13, 2015 · Updated 11 years ago
Chair-for-Security-Engineering / ecmongpu
ECM Factorization on CUDA-GPUs
☆16 · Sep 29, 2020 · Updated 5 years ago
yassram / iterative-closest-point
Iterative closest point GPU and CPU implementations (google benchmark)
☆19 · Nov 3, 2020 · Updated 5 years ago
lkawka / 3d-nearest-neighbor-search-in-kd-tree-cuda
Finding the nearest neighbor for 3d points in KD tree. Two implementations: nn.cpp (CPU) and nn.cu (CUDA GPU).
☆14 · Feb 18, 2022 · Updated 4 years ago
SameLight / ITRI-OpenDLA
Express DLA implementation for FPGA, revised based on NVDLA.
☆12 · Oct 17, 2019 · Updated 6 years ago
mengyangniu / ogbn-papers100m-sage
☆14 · Mar 1, 2021 · Updated 5 years ago
YulhwaKim / cutlass_tilesparse
CUDA templates for tile-sparse matrix multiplication based on CUTLASS.
☆52 · Mar 1, 2018 · Updated 8 years ago
nulidangxueshen / CSR2
A New Format for SIMD-accelerated SpMV
☆22 · Apr 4, 2022 · Updated 4 years ago