dian-lun-lin / SNIG
SNIG: Accelerated Large Sparse Neural Network Inference using Task Graph Parallelism
☆34 · Updated 3 years ago
Alternatives and similar repositories for SNIG:
Users who are interested in SNIG are comparing it to the libraries listed below.
- Artifact of ASPLOS'23 paper entitled: GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference ☆17 · Updated last year
- TLB Benchmarks ☆33 · Updated 7 years ago
- An extension library of WMMA API (Tensor Core API) ☆87 · Updated 6 months ago
- Heterogeneous Programming ☆17 · Updated last year
- 🎃 GPU load-balancing library for regular and irregular computations. ☆59 · Updated 7 months ago
- CUDA PTX-ISA Document, Chinese translation ☆32 · Updated last month
- ☆46 · Updated 5 years ago
- ☆48 · Updated 5 years ago
- ☆31 · Updated 2 years ago
- ☆40 · Updated 4 years ago
- Evaluating different memory managers for dynamic GPU memory ☆24 · Updated 4 years ago
- Multi-GPU dynamic scheduler using PGAS style cross-GPU communication ☆28 · Updated last year
- ☆84 · Updated 9 months ago
- MLIR Sample dialect ☆108 · Updated last week
- ☆38 · Updated 4 years ago
- Concurrent CPU-GPU Programming using Task Models ☆100 · Updated 5 years ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆85 · Updated 2 years ago
- A language and compiler for irregular tensor programs. ☆134 · Updated 2 months ago
- Dissecting NVIDIA GPU Architecture ☆84 · Updated 2 years ago
- An MLIR-based toy DL compiler for TVM Relay. ☆55 · Updated 2 years ago
- Bridging polyhedral analysis tools to the MLIR framework ☆107 · Updated last year
- development repository for the open earth compiler ☆79 · Updated 3 years ago
- Conversions to MLIR EmitC ☆126 · Updated last month
- BGHT: High-performance static GPU hash tables. ☆57 · Updated 4 months ago
- ☆40 · Updated this week
- ❤️ CUDA/C++ GPU graph analytics simplified. ☆31 · Updated 2 years ago
- LonestarGPU: Irregular algorithms parallelized for GPUs ☆33 · Updated 5 years ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 · Updated 3 years ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆127 · Updated last year
- TileFlow is a performance analysis tool based on Timeloop for fusion dataflows ☆55 · Updated 9 months ago