☆35Apr 10, 2024Updated last year
Alternatives and similar repositories for ConvStencil
Users that are interested in ConvStencil are comparing it to the libraries listed below
- Fast GPU based tensor core reductions☆13Jan 13, 2023Updated 3 years ago
- Code for High Performance Unstructured SpMM Computation Using Tensor Cores☆33Nov 3, 2024Updated last year
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.☆91Nov 23, 2022Updated 3 years ago
- ☆10May 12, 2022Updated 3 years ago
- Cavs: An Efficient Runtime System for Dynamic Neural Networks☆15Sep 18, 2020Updated 5 years ago
- Fast Synchronization-Free Algorithms for Parallel Sparse Triangular Solves with Multiple Right-Hand Sides (SpTRSM)☆14Feb 14, 2020Updated 6 years ago
- ☆14Jan 18, 2023Updated 3 years ago
- Mirror of http://gitlab.hpcrl.cse.ohio-state.edu/chong/ppopp19_ae, refactoring for understanding☆15Oct 20, 2021Updated 4 years ago
- ☆18May 14, 2024Updated last year
- New batched algorithm for sparse matrix-matrix multiplication (SpMM)☆16May 7, 2019Updated 6 years ago
- ☆32Jul 2, 2025Updated 8 months ago
- ☆18Oct 15, 2020Updated 5 years ago
- ☆43May 21, 2021Updated 4 years ago
- ☆23Feb 5, 2026Updated 3 weeks ago
- [EuroSys'24] Minuet: Accelerating 3D Sparse Convolutions on GPUs☆80Jun 7, 2024Updated last year
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018☆73Oct 5, 2020Updated 5 years ago
- ☆112Jul 3, 2021Updated 4 years ago
- ☆84Dec 2, 2022Updated 3 years ago
- ☆50Jun 27, 2019Updated 6 years ago
- A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves (SpTRSV)☆22Feb 14, 2020Updated 6 years ago
- An extension of TVMScript to write simple and high performance GPU kernels with tensorcore.☆51Jul 23, 2024Updated last year
- ☆88May 31, 2025Updated 9 months ago
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.☆53Oct 16, 2023Updated 2 years ago
- ☆42Nov 1, 2025Updated 4 months ago
- Artifacts of EVT ASPLOS'24☆29Mar 6, 2024Updated last year
- FlashSparse significantly reduces the computation redundancy for unstructured sparsity (for SpMM and SDDMM) on Tensor Cores through a Swa…☆39Oct 5, 2025Updated 4 months ago
- Integrated Training Platform (ITP) traces used in ElasticFlow paper.☆31Dec 23, 2022Updated 3 years ago
- A library for syntactically rewriting Python programs, pronounced (sinner).☆66Feb 22, 2022Updated 4 years ago
- ☆87Updated this week
- ☆36Sep 6, 2013Updated 12 years ago
- ☆166Feb 5, 2026Updated 3 weeks ago
- A size profiler for CUDA binaries☆72Jan 15, 2026Updated last month
- CUDA GPU implementation of GMRES iterative Solver☆10Apr 16, 2012Updated 13 years ago
- ☆46Jun 19, 2024Updated last year
- Quotes app, built with React Native, GraphQL backend☆11May 17, 2017Updated 8 years ago
- ☆40Feb 28, 2020Updated 6 years ago
- NCKU (National Cheng Kung University) course selection helper☆10Aug 28, 2015Updated 10 years ago
- LITS: An Optimized Learned Index for Strings☆13Jun 18, 2025Updated 8 months ago
- Official implementation of Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores.☆14Nov 13, 2025Updated 3 months ago