hgomersall / SSE-convolution
A demonstration of speeding up a 1D convolution using SSE
☆49 · Updated 8 years ago
Related projects:
- Some C++ codes for computing a 1D and 2D convolution product using the FFT implemented with the GSL or FFTW ☆57 · Updated 11 years ago
- Vectorizable implementations of some mathematical functions ☆102 · Updated 4 years ago
- STL-Compatible Lemire-Fenn algorithm for running min/max ☆26 · Updated 4 years ago
- fast log and exp functions for x86/x64 SSE ☆219 · Updated 3 weeks ago
- A portable high-level API with CUDA or OpenCL back-end ☆53 · Updated 6 years ago
- Automatically exported from code.google.com/p/math-neon ☆38 · Updated 9 years ago
- Template based C++11 FFT implementation. ☆53 · Updated 9 years ago
- Flexible Library for Efficient Numerical Solutions ☆126 · Updated 2 years ago
- a software library containing Sparse functions written in OpenCL ☆173 · Updated 4 years ago
- FFT (Fast Fourier Transform): SSE, AVX, AVX2 ☆50 · Updated 8 years ago
- Library for fast image convolution in neural networks on Intel Architecture ☆29 · Updated 7 years ago
- NumPy-compatible multidimensional arrays in C++ ☆160 · Updated last year
- Execution primitives for C++ ☆154 · Updated 4 years ago
- Boost SIMD ☆232 · Updated 5 years ago
- Implementation of the SYCL specification. ☆68 · Updated 3 months ago
- Code samples ☆62 · Updated 3 years ago
- A machine vision library written in SYCL and C++ that shows performance-portable implementation of graph algorithms ☆160 · Updated 5 months ago
- UME::SIMD A library for explicit simd vectorization. ☆90 · Updated 6 years ago
- an OpenCL based software library containing random number generation functions ☆132 · Updated 2 years ago
- A C++ library which allows the numerical optimisation of any given problem, function, program or you-name-it ☆45 · Updated 6 years ago
- CMake BASIS makes it easy to create sharable software and libraries that work together. This is accomplished by combining and documenting… ☆48 · Updated 3 years ago
- Easy to run kernels using OpenCL ☆183 · Updated 6 years ago
- Implementation of FIR and IIR filters optimized for SIMD processing ☆48 · Updated 7 years ago
- CMake module collection ☆29 · Updated 9 years ago
- A Light-weight and Fast Template Matrix Library ☆131 · Updated 11 years ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆76 · Updated 3 years ago
- Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels. ☆70 · Updated 8 years ago