ceruleangu / Block-Sparse-Benchmark

Benchmark for matrix multiplication between dense and block-sparse (BSR) matrices in TVM, blocksparse (Gray et al.), and cuSPARSE.
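
For readers unfamiliar with the operation being benchmarked, here is a minimal sketch of a dense-times-BSR product. It uses SciPy rather than any of the three libraries named above, and the helper names `random_bsr` and `dense_times_bsr`, the shapes, the block size, and the density are all illustrative assumptions, not taken from the repository.

```python
import numpy as np
from scipy.sparse import bsr_matrix


def random_bsr(n_rows, n_cols, block, density, seed=0):
    """Build a random BSR matrix whose nonzeros come in block x block tiles.

    Hypothetical helper for illustration; n_rows and n_cols must be
    multiples of block.
    """
    rng = np.random.default_rng(seed)
    # Decide which tiles are kept, then expand each kept tile to full size.
    tile_mask = (rng.random((n_rows // block, n_cols // block)) < density).astype(float)
    element_mask = np.kron(tile_mask, np.ones((block, block)))
    dense = element_mask * rng.standard_normal((n_rows, n_cols))
    return bsr_matrix(dense, blocksize=(block, block))


def dense_times_bsr(x, w):
    """Compute x @ w.T for a dense x and a BSR weight matrix w."""
    # SciPy supports sparse @ dense, so evaluate (w @ x.T).T instead.
    return (w @ x.T).T


x = np.random.default_rng(1).standard_normal((8, 64))   # dense activations
w = random_bsr(32, 64, block=16, density=0.25)          # block-sparse weights
y = dense_times_bsr(x, w)                                # dense result, shape (8, 32)
```

A real benchmark would time this product across block sizes and densities and compare it against the dense baseline; the sketch only shows the data layout and the product itself.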

☆23

Alternatives and similar repositories for Block-Sparse-Benchmark

Users who are interested in Block-Sparse-Benchmark are comparing it to the libraries listed below.

limenghao / AdaTune
This is the implementation for the paper: AdaTune: Adaptive Tensor Program Compilation Made Efficient (NeurIPS 2020).
☆14Updated 4 years ago
HPCRL / ASPLOS_artifact
☆13Updated 4 years ago
BoyuanFeng / APNN-TC
☆19Updated 4 years ago
UofT-EcoSystem / DietCode
DietCode Code Release
☆64Updated 3 years ago
marsupialtail / gpu-sparsert
☆18Updated 5 years ago
uuudown / SBNN
Singular Binarized Neural Network based on GPU Bit Operations (see our SC-19 paper)
☆16Updated 5 years ago
pku-liang / FlexTensor
Automatic Schedule Exploration and Optimization Framework for Tensor Computations
☆181Updated 3 years ago
pku-liang / AMOS
Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
☆120Updated 3 years ago
anony-sub / chameleon
Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation
☆27Updated 6 years ago
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆122Updated 3 years ago
ParCIS / Magicube
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.
☆90Updated 3 years ago
nox-410 / tvm.tl
An extension of TVMScript for writing simple, high-performance GPU kernels with Tensor Cores.
☆51Updated last year
owensgroup / merge-spmm
Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018
☆73Updated 5 years ago
uwsampl / SparseTIR
SparseTIR: Sparse Tensor Compiler for Deep Learning
☆141Updated 2 years ago
union-codesign / union
☆14Updated 4 years ago
pku-liang / MAGIS
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
☆55Updated last year
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆200Updated 3 years ago
awslabs / ratex
☆23Updated 4 months ago
zhaiyi000 / tlp
☆41Updated last year
apuaaChen / EVT_AE
Artifacts of EVT ASPLOS'24
☆28Updated last year
comaniac / epoi
Benchmark PyTorch Custom Operators
☆14Updated 2 years ago
cmu-catalyst / collage
System for automated integration of deep learning backends.
☆47Updated 3 years ago
uwsampl / sparsetir-artifact
Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning"
☆25Updated 2 years ago
tlc-pack / tenset
☆92Updated 3 years ago
masahi / tvm-cutlass-eval
☆41Updated 3 years ago
hgyhungry / alcop-artifact
☆23Updated 2 years ago
humuyan / Korch
ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch
☆40Updated 9 months ago
hgyhungry / ge-spmm
☆112Updated 4 years ago
awslabs / lorien
☆42Updated 2 years ago
dgSPARSE / dgSPARSE-Lib
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
☆119Updated 3 weeks ago