andylolu2 / simpleGEMM

The simplest but fast implementation of matrix multiplication in CUDA.
34 · Updated 6 months ago

Alternatives and similar repositories for simpleGEMM:

Users who are interested in simpleGEMM are comparing it to the libraries listed below.