spcl/smat

spcl / smat

Code for High Performance Unstructured SpMM Computation Using Tensor Cores

☆35

Alternatives and similar repositories for smat

Users that are interested in smat are comparing it to the libraries listed below.

HPMLL / DTC-SpMM_ASPLOS24
☆47 · Jun 19, 2024 · Updated 2 years ago
guqiqi / Samoyeds
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores (EuroSys'25)
☆16 · Jul 17, 2025 · Updated last year
SuperScientificSoftwareLaboratory / DASP
Source code of the SC '23 paper: "DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multipli…
☆29 · Jun 18, 2024 · Updated 2 years ago
UDC-GAC / venom
A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
☆62 · Nov 24, 2023 · Updated 2 years ago
CRAFT-THU / RoDe
A Row Decomposition-based Approach for Sparse Matrix Multiplication on GPUs
☆30 · Nov 29, 2023 · Updated 2 years ago
YukeWang96 / TC-GNN_ATC23
Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.
☆58 · Oct 16, 2023 · Updated 2 years ago
google-research / sputnik
A library of GPU kernels for sparse matrix operations.
☆289 · Nov 24, 2020 · Updated 5 years ago
AIS-SNU / GraNNDis_Artifact
[PACT'24] GraNNDis. A fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and min…
☆10 · Aug 13, 2024 · Updated last year
Hyaloid / AccSpMM
Official implementation of Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores.
☆17 · Nov 13, 2025 · Updated 8 months ago
xxyux / SpInfer
SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs
☆68 · Mar 25, 2025 · Updated last year
microsoft / ConvStencil
☆37 · Apr 10, 2024 · Updated 2 years ago
XiaosongAI / Parallel-SpMV
Parallel optimization algorithms for sparse matrix-vector multiplication (OpenMP, AVX)
☆11 · Jul 7, 2021 · Updated 5 years ago
SuperScientificSoftwareLaboratory / TileSpGEMM
Source code of the PPoPP '22 paper: "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs" by Y…
☆48 · May 22, 2024 · Updated 2 years ago
HicrestLaboratory / SPARTA
SParse AcceleRation on Tensor Architecture
☆18 · Apr 15, 2026 · Updated 3 months ago
spcl / arrow-matrix
Arrow Matrix Decomposition - Communication-Efficient Distributed Sparse Matrix Multiplication
☆15 · Mar 25, 2024 · Updated 2 years ago
hgyhungry / ShflBW_Sparse_NN
☆16 · Nov 22, 2022 · Updated 3 years ago
georgia-tech-synergy-lab / SparseAccelerator-RTL
Accelerator RTL inspired by VEGETA [HPCA'23] and MicroScopiQ [ISCA'25]
☆15 · Nov 11, 2025 · Updated 8 months ago
wudu98 / autoGEMM
☆15 · Dec 5, 2024 · Updated last year
gravins / Anti-SymmetricDGN
Official code repository for the papers "Anti-Symmetric DGN: a stable architecture for Deep Graph Networks" accepted at ICLR 2023; "Non-D…
☆15 · Jan 2, 2025 · Updated last year
weifengliu-ssslab / Benchmark_SpTRSV_using_CSC
A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves (SpTRSV)
☆23 · Feb 14, 2020 · Updated 6 years ago
iamkanghyunchoi / falqon
Official repository of paper [FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic, NeurIPS 2025]
☆21 · Dec 2, 2025 · Updated 7 months ago
casys-kaist / pimba
Official code repository for "Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving [MICRO'25]"
☆25 · Oct 23, 2025 · Updated 9 months ago
ParCIS / FlashSparse
FlashSparse significantly reduces the computation redundancy for unstructured sparsity (for SpMM and SDDMM) on Tensor Cores through a Swa…
☆39 · Oct 5, 2025 · Updated 9 months ago
hgyhungry / ge-spmm
☆115 · Jul 3, 2021 · Updated 5 years ago
CMU-SAFARI / SPARTA
A novel spatial accelerator for horizontal diffusion weather stencil computation, as described in ICS 2023 paper by Singh et al. (https:/…
☆22 · Jul 27, 2023 · Updated 2 years ago
YukeWang96 / QGTC_PPoPP22
Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core.
☆30 · Feb 12, 2022 · Updated 4 years ago
apuaaChen / vectorSparse
☆32 · Aug 24, 2022 · Updated 3 years ago
EnigmaHuang / Saad_Book_ForTran
Some "Formula Translations" for Yousef Saad's book "Iterative Methods for Sparse Linear Systems (2nd Edition)"
☆13 · Jan 14, 2018 · Updated 8 years ago
jaewonalive / PeerAiD
☆21 · Jun 6, 2024 · Updated 2 years ago
lluckydog / blockchainlab2023
☆13 · May 18, 2024 · Updated 2 years ago
abhibambhaniya / progressive_gradient_flow_nm_sparsity
Implementation of NM sparsity recipe presented in the paper "Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers".
☆11 · Feb 5, 2024 · Updated 2 years ago
Jazzcharles / CREAM
Weakly Supervised Object Localization via Class RE-Activation Mapping
☆12 · Sep 19, 2022 · Updated 3 years ago
ivan-pi / fortran-ascii
Fortran routines for manipulating ASCII characters (future pull request to fortran-lang/stdlib https://github.com/fortran-lang/stdlib)
☆13 · Jan 26, 2024 · Updated 2 years ago
poojahira / spmv-cuda
Implementation and analysis of five different GPU based SPMV algorithms in CUDA
☆39 · Feb 5, 2019 · Updated 7 years ago
sourceryinstitute / dag
Directed Acyclic Graphs With Modern Fortran
☆11 · May 25, 2023 · Updated 3 years ago
interkosmos / fortran-zstd
Fortran 2018 interface bindings to Zstandard (zstd)
☆11 · Jun 13, 2026 · Updated last month
chemeng / GPGPU-GMRES-Method
CUDA GPU implementation of GMRES iterative Solver
☆10 · Apr 16, 2012 · Updated 14 years ago
spcl / liblsb
☆24 · Jan 25, 2023 · Updated 3 years ago
dongli / fortran-container
This repository contains some container data structure types for Fortran.
☆13 · Sep 12, 2021 · Updated 4 years ago