Liu-xiandong / How_to_optimize_in_GPU

This is a series of GPU optimization topics that explains in detail how to optimize CUDA kernels. I introduce several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm, among others. The performance of these kernels is at or close to the theoretical limit.
889 · Updated last year

Alternatives and similar repositories for How_to_optimize_in_GPU:

Users who are interested in How_to_optimize_in_GPU are comparing it to the libraries listed below.