zheng-ningxin / SparTA

☆9

Alternatives and similar repositories for SparTA

Users who are interested in SparTA are comparing it to the libraries listed below.

nox-410 / tvm.tl
An extension of TVMScript for writing simple, high-performance GPU kernels with Tensor Cores.
☆50 Updated last year
TiledTensor / TiledLower
TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.
☆14 Updated 8 months ago
apuaaChen / EVT_AE
Artifacts of EVT ASPLOS'24
☆26 Updated last year
jiazhihao / attention_superoptimizer
An Attention Superoptimizer
☆22 Updated 6 months ago
LeiWang1999 / Stream-k.tvm
☆19 Updated 10 months ago
google / iopddl
Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
☆23 Updated 2 months ago
UofT-EcoSystem / DietCode
DietCode Code Release
☆64 Updated 3 years ago
tile-ai / tilescale
Tile-based language built for AI computation across all scales
☆31 Updated this week
TiledTensor / TiledKernel
TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.
☆19 Updated last year
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆122 Updated 3 years ago
zhuohan123 / terapipe
☆75 Updated 4 years ago
parasailteam / coconet
☆80 Updated 2 years ago
hgyhungry / alcop-artifact
☆23 Updated 2 years ago
tsinghua-ideal / Canvas
Canvas: End-to-End Kernel Architecture Search in Neural Networks
☆27 Updated 8 months ago
HPMLL / NVIDIA-Hopper-Benchmark
☆51 Updated 2 months ago
pku-liang / MAGIS
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
☆53 Updated last year
xxyux / SpInfer
SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs
☆50 Updated 4 months ago
humuyan / Korch
ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch
☆38 Updated 4 months ago
awslabs / lorien
☆43 Updated last year
microsoft / FractalTensor
FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …
☆28 Updated 7 months ago
ceruleangu / Block-Sparse-Benchmark
Benchmark for matrix multiplications between dense and block-sparse (BSR) matrices in TVM, blocksparse (Gray et al.), and cuSPARSE.
☆24 Updated 4 years ago
awslabs / ratex
☆23 Updated 8 months ago
sjtu-epcc / Tacker
Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS
☆31 Updated 5 months ago
ParCIS / Magicube
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.
☆89 Updated 2 years ago
GVProf / GVProf
GVProf: A Value Profiler for GPU-based Clusters
☆51 Updated last year
zhisbug / Cavs
Cavs: An Efficient Runtime System for Dynamic Neural Networks
☆14 Updated 4 years ago
SJTU-IPADS / disb
DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.
☆53 Updated 11 months ago
tile-ai / TileOPs
☆44 Updated last week
heheda12345 / MagPy
☆39 Updated last year
zhaiyi000 / tlp
☆41 Updated last year