hgomersall / SSE-convolution
A demonstration of speeding up a 1D convolution using SSE
☆51 · Updated 8 years ago
Alternatives and similar repositories for SSE-convolution
Users who are interested in SSE-convolution are comparing it to the libraries listed below.
- Some C++ codes for computing a 1D and 2D convolution product using the FFT implemented with the GSL or FFTW ☆58 · Updated 12 years ago
- Template based C++11 FFT implementation. ☆53 · Updated 10 years ago
- Library for fast image convolution in neural networks on Intel Architecture ☆29 · Updated 8 years ago
- Automatically exported from code.google.com/p/math-neon ☆40 · Updated 10 years ago
- Vectorizable implementations of some mathematical functions ☆103 · Updated 5 years ago
- STL-Compatible Lemire-Fenn algorithm for running min/max ☆27 · Updated 5 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 · Updated 7 years ago
- CMake module collection ☆30 · Updated 10 years ago
- fast log and exp functions for AVX2/AVX-512 ☆231 · Updated 3 months ago
- A simple example of performing a one-dimensional discrete convolution using the FFTW library. ☆15 · Updated 10 years ago
- Implementation of FIR and IIR filters optimized for SIMD processing ☆49 · Updated 8 years ago
- Blazing-fast Expression Templates Library (ETL) with GPU support, in C++ ☆226 · Updated 3 weeks ago
- a software library containing Sparse functions written in OpenCL ☆175 · Updated 5 years ago
- FFT (Fast Fourier Transform): SSE, AVX, AVX2 ☆51 · Updated 8 years ago
- Generative Fast Fourier Transforms in C++ using template metaprogramming ☆10 · Updated 9 years ago
- Launching collective tasks in bulk ☆37 · Updated 5 years ago
- Range-based for loops to iterate over a range of numbers or values ☆35 · Updated 8 years ago
- Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels. ☆70 · Updated 9 years ago
- Polymorphic multidimensional array view ☆36 · Updated 5 years ago
- Fast Runtime-Flexible Multi-dimensional Arrays and Views for C++ ☆48 · Updated 2 years ago
- Flexible Library for Efficient Numerical Solutions ☆127 · Updated 2 weeks ago
- A Light-weight and Fast Template Matrix Library ☆132 · Updated 12 years ago
- Full-speed Array of Structures access ☆171 · Updated 2 years ago
- ☆68 · Updated 2 years ago
- Generalized Histograms for CUDA-capable GPUs ☆42 · Updated 9 years ago
- NumPy-compatible multidimensional arrays in C++ ☆161 · Updated 8 months ago
- a heterogeneous multiGPU level-3 BLAS library ☆45 · Updated 5 years ago
- A machine vision library written in SYCL and C++ that shows performance-portable implementation of graph algorithms ☆161 · Updated last year
- an OpenCL based software library containing random number generation functions ☆136 · Updated 3 years ago
- WIP · CUDA compatibility for Blaze · https://bitbucket.org/blaze-lib/blaze ☆18 · Updated 5 years ago