Shigangli / Magicube
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) for deep learning on Tensor Cores.
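For reference, the two kernels Magicube accelerates have simple dense-math semantics. The sketch below (plain NumPy/SciPy, not Magicube's CUDA API) illustrates what SpMM and SDDMM compute; the matrix shapes and the masked-product formulation of SDDMM are illustrative assumptions, not taken from the library:

```python
# Reference semantics of SpMM and SDDMM (NumPy/SciPy sketch).
# Illustrates what the operations compute; NOT Magicube's API.
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)

# A sparse matrix S (M x K) and dense operands (illustrative sizes).
M, K, N = 8, 16, 4
S = sp.random(M, K, density=0.25, format="csr", random_state=0)
B = rng.standard_normal((K, N))

# SpMM: sparse-dense product, C = S @ B, producing a dense M x N result.
C = S @ B

# SDDMM: evaluate the dense product A @ Bt^T only at the nonzero
# positions of S (the "sampling" mask); output stays sparse.
A = rng.standard_normal((M, N))
Bt = rng.standard_normal((K, N))
rows, cols = S.nonzero()
vals = np.einsum("ij,ij->i", A[rows], Bt[cols])  # one dot product per nonzero
D = sp.csr_matrix((vals, (rows, cols)), shape=(M, K))

# Sanity check: SDDMM equals the dense product masked by S's pattern.
dense_ref = (A @ Bt.T) * (S.toarray() != 0)
assert np.allclose(D.toarray(), dense_ref)
```

A Tensor Core implementation additionally quantizes the operands to low-precision integers and tiles the sparse structure to match the hardware's fixed MMA fragment shapes, which is the part Magicube specializes in.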
Related projects:
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
- Mirror of http://gitlab.hpcrl.cse.ohio-state.edu/chong/ppopp19_ae, refactored for easier understanding
- A Row Decomposition-based Approach for Sparse Matrix Multiplication on GPUs
- A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
- Dissecting NVIDIA GPU Architecture
- Artifact for the PPoPP '22 paper "QGTC: Accelerating Quantized GNN via GPU Tensor Core"
- Source code of the PPoPP '22 paper "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs"
- SparseTIR: Sparse Tensor Compiler for Deep Learning
- DietCode Code Release
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS '24)
- Code for the paper "Design Principles for Sparse Matrix Multiplication on the GPU" (Euro-Par 2018)
- PyTorch-based fast and efficient processing for various machine learning applications with diverse sparsity
- TileFlow: a performance-analysis tool based on Timeloop for fusion dataflows
- Artifacts of EVT (ASPLOS '24)
- Artifact-evaluation repository for the ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning"
- Source code of the SC '23 paper "DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multiplication"
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Cores)
- Implementation of TSM2L and TSM2R: high-performance tall-and-skinny matrix-matrix multiplication algorithms for CUDA