jeremyfix / FFTConvolution
C++ code for computing 1D and 2D convolution products using the FFT, implemented with GSL or FFTW
☆57 Updated 11 years ago
Related projects
Alternatives and complementary repositories for FFTConvolution
- A demonstration of speeding up a 1D convolution using SSE ☆49 Updated 8 years ago
- Library for fast image convolution in neural networks on Intel Architecture ☆29 Updated 7 years ago
- Automatically exported from code.google.com/p/math-neon ☆38 Updated 9 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 Updated 7 years ago
- Tutorial to optimize GEMM performance on Android ☆51 Updated 8 years ago
- A machine vision library written in SYCL and C++ that shows performance-portable implementation of graph algorithms ☆160 Updated 7 months ago
- CNNs in Halide ☆23 Updated 9 years ago
- OpenCL for Nets - A Deep Learning Framework based on OpenCL, written in C++. Supports popular MLP, RNN (LSTM), CNN (ResNet). Friendly debug… ☆66 Updated 5 years ago
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL ☆135 Updated 7 years ago
- Just my local copy of math-neon with build script ☆91 Updated 6 years ago
- Vector Math Library ☆75 Updated 7 years ago
- A stub OpenCL library that dynamically dlopen/dlsyms OpenCL implementations at runtime based on environment variables. Will be useful when… ☆67 Updated 8 months ago
- Automatically exported from code.google.com/p/opencl-book-samples ☆162 Updated 5 years ago
- Example code used in the CVPR 2015 tutorial ☆39 Updated 9 years ago
- ONNX Parser is a tool that automatically generates OpenVX inference code (CNN) from ONNX binary model files. ☆17 Updated 5 years ago
- Corrected source for the OpenCL in Action book (work in progress) ☆61 Updated 11 years ago
- Implementation of the Guided Image Filtering algorithm in OpenCL ☆48 Updated 3 years ago
- C99/C++ header-only library for division via fixed-point multiplication by inverse ☆49 Updated 7 months ago
- Easy to run kernels using OpenCL ☆183 Updated 6 years ago
- Portable 128-bit SIMD intrinsics ☆57 Updated last year
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 Updated 7 years ago
- BLAS OpenCL implementation. ☆15 Updated 9 years ago
- ☆67 Updated 2 years ago
- A GPU implementation of the Wavelet Transform ☆69 Updated 4 years ago
- Set of basic classes (vector, matrix, images and memory array) for CPU and GPU ☆17 Updated 3 years ago
- Communication-Minimizing 2D Convolution in GPU Registers ☆30 Updated 11 years ago
- A software library containing sparse functions written in OpenCL ☆173 Updated 4 years ago
- Proof-of-Concept CNN in Halide ☆21 Updated 8 years ago
- Generalized Histograms for CUDA-capable GPUs ☆43 Updated 9 years ago