pietrobongini / CUDA-ImageConvolution

Implementations of 2D Image Convolution algorithm with CUDA (using global memory, shared memory and constant memory)

☆17

Alternatives and similar repositories for CUDA-ImageConvolution

Users who are interested in CUDA-ImageConvolution are comparing it to the repositories listed below.

ysh329 / OpenCL-101
Learn OpenCL step by step.
☆138 Updated 2 years ago
cjmcv / hpc
Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)
☆60 Updated 4 months ago
pigirons / spmv
This is a tuned sparse matrix dense vector multiplication (SpMV) library
☆21 Updated 9 years ago
OrangeOwlSolutions / General-CUDA-programming
☆44 Updated 7 years ago
chasingegg / Winconv
implementation of winograd minimal convolution algorithm on Intel Architecture
☆39 Updated 7 years ago
cwpearson / nvidia-performance-tools
Instructions, Docker images, and examples for Nsight Compute and Nsight Systems
☆131 Updated 5 years ago
victusfate / opencl-book-examples
clone of https://code.google.com/p/opencl-book-samples (there's an official repo here https://github.com/bgaster/opencl-book-samples)
☆45 Updated 12 years ago
NVlabs / cub
THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.
☆84 Updated last year
dorthyluu / cs194-winograd
☆26 Updated 8 years ago
JamesTheZ / VersaPipe
A framework for pipelined computing on GPU
☆29 Updated 6 years ago
passlab / CUDAMicroBench
☆42 Updated last month
weifengliu-ssslab / Benchmark_SpGEMM_using_CSR
CSR-based SpGEMM on nVidia and AMD GPUs
☆46 Updated 9 years ago
karakozov / gpudma
GPUDirect example
☆60 Updated 3 years ago
lzhengchun / matrix-cuda
matrix multiplication in CUDA
☆123 Updated last year
csehydrogen / Winograd-OpenCL
Winograd-based convolution implementation in OpenCL
☆28 Updated 8 years ago
CSshengxy / MEC
ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)
☆17 Updated 6 years ago
accel-sim / gpu-app-collection
A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions.
☆71 Updated last week
eegkno / CUDA_by_practice
CUDA by practice
☆129 Updated 5 years ago
ROCm / HCC-Example-Application
HCC Sample Applications
☆13 Updated 8 years ago
gcoe-dresden / cuda-gpu-tlb
TLB Benchmarks
☆34 Updated 7 years ago
rafalk342 / bfs-cuda
Implementation of breadth first search on GPU with CUDA Driver API.
☆51 Updated 4 years ago
ekondis / gpumembench
A GPU benchmark suite for assessing on-chip GPU memory bandwidth
☆106 Updated 7 years ago
lightsighter / CudaDMA
Emulating DMA Engines on GPUs for Performance and Portability
☆40 Updated 10 years ago
lukeyeager / cmake-cuda-example
Example of how to use CUDA with CMake >= 3.8
☆70 Updated last month
tlc-pack / tophub
tophub autotvm log collections
☆70 Updated 2 years ago
codeplaysoftware / portDNN
portDNN is a library implementing neural network algorithms written using SYCL
☆113 Updated last year
xuqiantong / CUDA-Winograd
Fast CUDA Kernels for ResNet Inference.
☆177 Updated 6 years ago
AlexeyMalkhanov / Cardiac_demo
project implements minimal functionality for real-time 3D cardiac electrophysiology simulation
☆16 Updated 8 years ago
adwaitjog / mafia
MAFIA: Multiple Application Framework for GPU architectures
☆27 Updated 3 years ago
andersy005 / tvm-in-action
TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together
☆64 Updated 7 years ago