Optimized half-precision GEMM assembly kernels (deprecated due to ROCm)
☆47 · Jun 16, 2017 · Updated 9 years ago
Alternatives and similar repositories for GCNGEMM
Users that are interested in GCNGEMM are comparing it to the libraries listed below.
- Assembly GEMM implementation on Vega 64 for 4096x4096 fp32 matrices · ☆22 · Oct 12, 2019 · Updated 6 years ago
- CUDA GPU implementation of the GMRES iterative solver · ☆10 · Apr 16, 2012 · Updated 14 years ago
- Highly optimized FFT library based on CUDA (as fast as cuFFT, and sometimes faster) · ☆19 · Jun 13, 2017 · Updated 9 years ago
- HCC Sample Applications · ☆13 · Jan 3, 2017 · Updated 9 years ago
- Documents and source code related to a Hybrid HPL run for IU's BR2 machine · ☆16 · Nov 27, 2012 · Updated 13 years ago
- ☆14 · Mar 21, 2019 · Updated 7 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms · ☆12 · Jul 7, 2017 · Updated 8 years ago
- A MXNet implementation of Xception · ☆20 · Sep 26, 2017 · Updated 8 years ago
- Assembler for NVIDIA Maxwell architecture · ☆1,070 · Jan 3, 2023 · Updated 3 years ago
- CMake configurations for PPL projects · ☆12 · Aug 10, 2024 · Updated last year
- MIOpenGEMM is now deprecated · ☆61 · Jul 17, 2023 · Updated 2 years ago
- A highly efficient library for GEMM operations on Sunway TaihuLight · ☆18 · Sep 7, 2020 · Updated 5 years ago
- A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves (SpTRSV) · ☆23 · Feb 14, 2020 · Updated 6 years ago
- Set of basic classes (vector, matrix, images and memory array) for CPU and GPU