Fast GPU-based tensor core reductions
☆13 · Jan 13, 2023 · Updated 3 years ago
Alternatives and similar repositories for reduction-tensor-cores
Users who are interested in reduction-tensor-cores are comparing it to the repositories listed below.
- ☆35 · Apr 10, 2024 · Updated last year
- ☆50 · Jun 27, 2019 · Updated 6 years ago
- ☆32 · Aug 24, 2022 · Updated 3 years ago
- New batched algorithm for sparse matrix-matrix multiplication (SpMM) · ☆16 · May 7, 2019 · Updated 6 years ago
- ☆43 · May 21, 2021 · Updated 4 years ago
- Test suite for probing the numerical behavior of NVIDIA tensor cores · ☆43 · Jul 24, 2024 · Updated last year
- Source code of the PPoPP '22 paper: "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs" by Y… · ☆46 · May 22, 2024 · Updated last year
- Benchmark implementation of CosmoFlow in TensorFlow Keras · ☆22 · Feb 7, 2024 · Updated 2 years ago
- ☆42 · Nov 1, 2025 · Updated 4 months ago
- ☆112 · Apr 19, 2024 · Updated last year
- Artifacts of EVT ASPLOS'24 · ☆29 · Mar 6, 2024 · Updated 2 years ago
- Create and deploy virtual-experiments - co-processing computational workflows · ☆10 · Jan 28, 2026 · Updated last month
- ☆112 · Jul 3, 2021 · Updated 4 years ago
- A GPU algorithm for sparse matrix-matrix multiplication · ☆75 · Oct 1, 2020 · Updated 5 years ago
- ☆53 · Feb 24, 2026 · Updated last week
- ext_mpi_collectives · ☆11 · Apr 1, 2025 · Updated 11 months ago
- CUDA GPU implementation of GMRES iterative Solver · ☆10 · Apr 16, 2012 · Updated 13 years ago
- All Resources from Stanford CS106B 2021 · ☆24 · Jul 11, 2025 · Updated 7 months ago
- Memory Topology for GPUs · ☆17 · Feb 13, 2026 · Updated 3 weeks ago
- PARADIS, a lightweight and flexible weather forecast model that tries to Keep It Simple. · ☆26 · Feb 4, 2026 · Updated last month
- Some "Formula Translations" for Yousef Saad's book "Iterative Methods for Sparse Linear Systems (2nd Edition)" · ☆13 · Jan 14, 2018 · Updated 8 years ago
- ☆46 · Jun 19, 2024 · Updated last year
- ☆36 · Aug 25, 2023 · Updated 2 years ago
- ATLAHS: An Application-centric Network Simulator Toolchain for AI, HPC, and Distributed Storage · ☆71 · Feb 6, 2026 · Updated last month
- Convolution operator optimization on GPU, including GEMM-based (implicit GEMM) convolution. · ☆43 · Sep 29, 2025 · Updated 5 months ago
- ☆159 · Dec 26, 2024 · Updated last year
- Performance Counter Reader · ☆11 · Sep 14, 2022 · Updated 3 years ago
- ☆11 · Feb 27, 2024 · Updated 2 years ago
- 2D time-domain isotropic (visco)elastic FD modeling and full waveform inversion (FWI) code for SH-waves · ☆13 · Aug 9, 2020 · Updated 5 years ago
- Code for paper "Beyond Closure Models: Learning Chaotic Systems via Physics-Informed Neural Operators". · ☆14 · Dec 24, 2025 · Updated 2 months ago
- Quotes app, built with React Native, GraphQL backend · ☆11 · May 17, 2017 · Updated 8 years ago
- OpenMP offload playground · ☆10 · Nov 16, 2024 · Updated last year
- NCKU (National Cheng Kung University) course selection helper · ☆10 · Aug 28, 2015 · Updated 10 years ago
- GPU based 2D elastic FWI · ☆12 · Mar 6, 2018 · Updated 8 years ago
- How to build an ACP compliant agent that uses MCP as well! · ☆11 · May 6, 2025 · Updated 10 months ago
- ☆10 · Feb 25, 2026 · Updated last week
- Argonne Leadership Computing Facility OpenCL tutorial · ☆10 · Aug 22, 2025 · Updated 6 months ago
- EPOCH Input System Version 2 · ☆10 · Jun 5, 2020 · Updated 5 years ago
- Assembler and Decompiler for NVIDIA (Maxwell Pascal Volta Turing Ampere) GPUs. · ☆93 · Feb 23, 2023 · Updated 3 years ago