debowin / cuda-tiled-2D-convolution

An optimized, parallel, tiled approach to 2D convolution that takes advantage of lower-latency, higher-bandwidth shared memory within GPU thread blocks, as well as global constant memory that is cached aggressively.
14 · Updated 7 years ago
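
The tiled approach named in the description can be illustrated with a minimal sketch. This is not the repository's code: the names `conv2dTiled`, `d_mask`, `TILE_W`, and `MASK_W`, the 5x5 mask size, the 16x16 output tile, and the zero-padded borders are all assumptions made for illustration. Each thread block stages one input tile (plus its halo) in shared memory, and the mask is read from constant memory.

```cuda
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

#define MASK_W  5                       // assumed mask width (odd)
#define RADIUS  (MASK_W / 2)
#define TILE_W  16                      // assumed output tile width per block
#define BLOCK_W (TILE_W + MASK_W - 1)   // input tile width, including the halo

// Hypothetical mask storage: constant memory is read-only in kernels and cached.
__constant__ float d_mask[MASK_W * MASK_W];

__global__ void conv2dTiled(const float* in, float* out, int height, int width) {
    __shared__ float tile[BLOCK_W][BLOCK_W];
    int tx = threadIdx.x, ty = threadIdx.y;
    int outRow = blockIdx.y * TILE_W + ty;
    int outCol = blockIdx.x * TILE_W + tx;
    int inRow = outRow - RADIUS;        // shift so the block also covers the halo
    int inCol = outCol - RADIUS;

    // Every thread loads one input element; out-of-range elements become zero.
    if (inRow >= 0 && inRow < height && inCol >= 0 && inCol < width)
        tile[ty][tx] = in[inRow * width + inCol];
    else
        tile[ty][tx] = 0.0f;
    __syncthreads();

    // Only the first TILE_W x TILE_W threads compute an output element.
    if (ty < TILE_W && tx < TILE_W && outRow < height && outCol < width) {
        float acc = 0.0f;
        // The mask is applied without flipping (identical for symmetric masks).
        for (int i = 0; i < MASK_W; ++i)
            for (int j = 0; j < MASK_W; ++j)
                acc += d_mask[i * MASK_W + j] * tile[ty + i][tx + j];
        out[outRow * width + outCol] = acc;
    }
}

int main() {
    const int height = 67, width = 45;  // deliberately not multiples of TILE_W
    std::vector<float> h_in(height * width), h_out(height * width);
    std::vector<float> h_mask(MASK_W * MASK_W);
    for (int i = 0; i < height * width; ++i) h_in[i] = (float)(i % 7);
    for (int i = 0; i < MASK_W * MASK_W; ++i) h_mask[i] = (float)(i % 3);

    float *d_in = nullptr, *d_out = nullptr;
    size_t bytes = h_in.size() * sizeof(float);
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(d_mask, h_mask.data(), h_mask.size() * sizeof(float));

    dim3 block(BLOCK_W, BLOCK_W);
    dim3 grid((width + TILE_W - 1) / TILE_W, (height + TILE_W - 1) / TILE_W);
    conv2dTiled<<<grid, block>>>(d_in, d_out, height, width);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // CPU reference with the same zero-padding convention, for comparison.
    float maxErr = 0.0f;
    for (int r = 0; r < height; ++r)
        for (int c = 0; c < width; ++c) {
            float acc = 0.0f;
            for (int i = 0; i < MASK_W; ++i)
                for (int j = 0; j < MASK_W; ++j) {
                    int rr = r + i - RADIUS, cc = c + j - RADIUS;
                    if (rr >= 0 && rr < height && cc >= 0 && cc < width)
                        acc += h_mask[i * MASK_W + j] * h_in[rr * width + cc];
                }
            maxErr = fmaxf(maxErr, fabsf(acc - h_out[r * width + c]));
        }
    printf("max abs error vs CPU reference: %g\n", maxErr);

    cudaFree(d_in);
    cudaFree(d_out);
    return maxErr < 1e-4f ? 0 : 1;
}
```

In this sketch each block launches `BLOCK_W x BLOCK_W` threads so that a single load per thread fills the shared tile, at the cost of leaving the halo threads idle during the compute phase. Other tiling strategies, such as launching `TILE_W x TILE_W` threads and loading the halo in extra passes, trade that idleness for more complex indexing.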

Alternatives and similar repositories for cuda-tiled-2D-convolution

Users that are interested in cuda-tiled-2D-convolution are comparing it to the libraries listed below
