hgyhungry / ShflBW_Sparse_NN
☆16Nov 22, 2022Updated 3 years ago
Alternatives and similar repositories for ShflBW_Sparse_NN
Users who are interested in ShflBW_Sparse_NN are comparing it to the libraries listed below.
- A Row Decomposition-based Approach for Sparse Matrix Multiplication on GPUs☆28Nov 29, 2023Updated 2 years ago
- Source code of the PPoPP '22 paper: "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs" by Y…☆46May 22, 2024Updated last year
- Repository for artifact evaluation of ASPLOS 2023 paper "SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning"☆25Feb 24, 2023Updated 2 years ago
- PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity☆121Dec 22, 2025Updated last month
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.☆53Oct 16, 2023Updated 2 years ago
- ☆112Jul 3, 2021Updated 4 years ago
- Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core.☆30Feb 12, 2022Updated 4 years ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.☆91Nov 23, 2022Updated 3 years ago
- ☆32Aug 24, 2022Updated 3 years ago
- Sparse kernels for GNNs based on TVM☆17Nov 18, 2020Updated 5 years ago
- Code for High Performance Unstructured SpMM Computation Using Tensor Cores☆32Nov 3, 2024Updated last year
- ☆24Mar 15, 2023Updated 2 years ago
- A GPU algorithm for sparse matrix-matrix multiplication☆75Oct 1, 2020Updated 5 years ago
- Implementation of FusedMM method for IPDPS 2021 paper titled "FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural N…☆31Aug 12, 2022Updated 3 years ago
- ☆35Dec 22, 2025Updated last month
- CUDA GPU implementation of GMRES iterative Solver☆10Apr 16, 2012Updated 13 years ago
- Some "Formula Translations" for Yousef Saad's book "Iterative Methods for Sparse Linear Systems (2nd Edition)"☆13Jan 14, 2018Updated 8 years ago
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018☆73Oct 5, 2020Updated 5 years ago
- ☆70Jun 16, 2021Updated 4 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning☆142Mar 31, 2023Updated 2 years ago
- ☆45Jun 19, 2024Updated last year
- ☆164Jul 22, 2024Updated last year
- A deep learning intermediate representation for multi-platform compilation optimization☆10Oct 28, 2024Updated last year
- Research simulation toolkit for federated learning☆13Nov 7, 2020Updated 5 years ago
- ☆40Feb 28, 2020Updated 5 years ago
- This GPU based computer vision software tracks a color-grid pattern at ~60Hz using an Android phone for effective position tracking in VR…☆13Feb 14, 2018Updated 7 years ago
- libFastMesh - Optimized Finite Volume Computational Aeroacoustics (CAA) Code☆13Mar 28, 2024Updated last year
- FP16-based 2D systolic array circuit design☆13Feb 23, 2023Updated 2 years ago
- ☆12Jan 13, 2023Updated 3 years ago
- ☆12Sep 18, 2024Updated last year
- OpenFOAM right wmake at the right time☆11Mar 10, 2019Updated 6 years ago
- A compact and extensible image viewer☆11Jun 22, 2020Updated 5 years ago
- Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025)☆15Jan 6, 2026Updated last month
- An improved version of `w`☆14Mar 16, 2017Updated 8 years ago
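- A general-purpose CNN (convolutional neural network) accelerator based on Xilinx FPGAs; the design targets the KV260 board and is portable to any MpSoC architecture☆18Dec 13, 2024Updated last year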
- ☆12Nov 22, 2022Updated 3 years ago
- ☆12Jan 19, 2020Updated 6 years ago
- GPU implementation of Winograd convolution☆10Oct 23, 2017Updated 8 years ago
- ☆13Jan 18, 2020Updated 6 years ago