chrischoy / CUDA-FFT-Convolution

CUDA FFT convolution

☆16

Alternatives and similar repositories for CUDA-FFT-Convolution

Users who are interested in CUDA-FFT-Convolution are comparing it to the libraries listed below.

naibaf7 / libdnn
Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL
☆137 Updated 8 years ago
NVIDIA / cnmem
A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory
☆298 Updated 6 years ago
ColfaxResearch / FALCON
Library for fast image convolution in neural networks on Intel Architecture
☆31 Updated 8 years ago
strin / gemm-android
tutorial to optimize GEMM performance on android
☆51 Updated 9 years ago
NVIDIA / kmeans
kmeans clustering with multi-GPU capabilities
☆119 Updated 2 years ago
hyln9 / GCNGEMM
Optimized half precision gemm assembly kernels (deprecated due to ROCm)
☆47 Updated 8 years ago
linnanwang / BLASX
a heterogeneous multiGPU level-3 BLAS library
☆46 Updated 5 years ago
jeremyfix / FFTConvolution
Some C++ code for computing 1D and 2D convolution products using the FFT, implemented with the GSL or FFTW
☆59 Updated 12 years ago
OpenMathLib / OpenVML
Vector Math Library
☆82 Updated last month
zhiqi-0 / RDMA-MXNet-ps-lite
RDMA Optimization on MXNet
☆14 Updated 7 years ago
zhxfl / CUDA-CNN
CNN accelerated by CUDA. Tested on MNIST, finally reaching 99.76%
☆185 Updated 7 years ago
bryancatanzaro / kmeans
kmeans
☆55 Updated 9 years ago
springer13 / hptt
High-Performance Tensor Transpose library
☆205 Updated 2 years ago
gujunli / OpenCL-caffe
OpenCL version of caffe
☆18 Updated 9 years ago
cudpp / cudpp
CUDA Data Parallel Primitives Library
☆434 Updated 6 years ago
ctuning / ck-tensorrt
Collective Knowledge repository for NVIDIA's TensorRT
☆37 Updated 4 years ago
eBay / maxDNN
High Efficiency Convolution Kernel for Maxwell GPU Architecture
☆136 Updated 8 years ago
hannes-brt / cudnn-python-wrappers
Python wrappers for the NVIDIA cuDNN libraries
☆141 Updated 8 years ago
CNugteren / myGEMM
Code appendix to an OpenCL matrix-multiplication tutorial
☆178 Updated 8 years ago
dmlc / MXNet.cpp
C++ interface for mxnet
☆115 Updated 8 years ago
lukeyeager / cmake-cuda-example
Example of how to use CUDA with CMake >= 3.8
☆70 Updated 4 months ago
intel / ideep
Intel® Optimization for Chainer*, a Chainer module providing numpy like API and DNN acceleration using MKL-DNN.
☆172 Updated this week
deeplearningais / CUV
Matrix library for CUDA in C++ and Python
☆196 Updated 8 years ago
tqchen / mshadow
Matrix Shadow: Lightweight CPU/GPU Matrix and Tensor Template Library in C++/CUDA for (Deep) Machine Learning
☆33 Updated 8 years ago
milakov / nnForge
Convolutional neural networks C++ framework with CPU and GPU (CUDA) backends
☆182 Updated 6 years ago
CNugteren / CLTune
CLTune: An automatic OpenCL & CUDA kernel tuner
☆182 Updated 2 years ago
tbennun / cudnn-training
A CUDNN minimal deep learning training code sample using LeNet.
☆268 Updated 2 years ago
OAID / MXNet-HRT
Heterogeneous Run Time version of MXNet. Added heterogeneous capabilities to the MXNet, uses heterogeneous computing infrastructure frame…
☆72 Updated 7 years ago
ahmetaa / fast-dnn
A fast deep neural network library (CPU) for speech recognition
☆84 Updated 6 years ago
MKLab-ITI / CUDA
GPU-accelerated LIBSVM is a modification of the original LIBSVM that exploits the CUDA framework to significantly reduce processing time …
☆218 Updated 8 years ago