KAdamek / GPU_Overlap-and-save_convolution
Shared memory overlap-and-save method for NVIDIA GPUs using CUDA
☆16 · Updated 2 years ago
Alternatives and similar repositories for GPU_Overlap-and-save_convolution
Users who are interested in GPU_Overlap-and-save_convolution are comparing it to the libraries listed below.
- Fast Fourier transform on GPU in shared memory for AstroAccelerate project ☆26 · Updated 4 years ago
- CUDA-based implementation for linear 1D, 2D and 3D FFT-Shift functions. ☆22 · Updated 9 years ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆53 · Updated 3 months ago
- Parallel selection on GPUs ☆16 · Updated 4 years ago
- Some C++ codes for computing a 1D and 2D convolution product using the FFT implemented with the GSL or FFTW ☆58 · Updated 12 years ago
- A GPU based FX correlator for radio astronomy ☆35 · Updated 6 years ago
- High Availability Shared Pipeline Engine ☆15 · Updated last year
- FFTW code optimized for AMD based processors ☆54 · Updated 3 weeks ago
- MPI accelerator-integrated communication extensions ☆33 · Updated 2 years ago
- CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core. ☆59 · Updated 2 years ago
- Kernel Tuning Toolkit ☆59 · Updated 3 weeks ago
- Error-Free Transformations as building blocks for compensated algorithms ☆15 · Updated 2 years ago
- A simple thread-safe C++ wrapper for FFTW & MKL ☆15 · Updated 3 years ago
- A Massively Parallel FFT Library for CPU/GPU ☆56 · Updated 4 years ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆105 · Updated 7 years ago
- FFTX Project ☆24 · Updated 2 weeks ago
- CUDA FFT convolution ☆15 · Updated 10 years ago
- Fast matrix multiplication ☆29 · Updated 3 years ago
- Next generation library for iterative sparse solvers for ROCm platform ☆81 · Updated last week
- AstroAccelerate is a many-core accelerated software package for processing time-domain radio-astronomy data. ☆46 · Updated 4 months ago
- FFT implementation based on FFTPack, but with several improvements, cloned from ☆26 · Updated last year
- MagmaDNN: a simple deep learning framework in C++ ☆49 · Updated 4 years ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆35 · Updated 9 years ago
- A demo of the fast Fourier transform in CUDA, implemented with the Cooley-Tukey and Stockham methods ☆8 · Updated 7 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 · Updated last year
- Nonuniform fast Fourier transforms of types 1 and 2, in 1D, 2D, and 3D, on the GPU ☆87 · Updated last year
- Software to support people learning OpenMP with our book ... The OpenMP Common Core: Making OpenMP Simple Again ☆82 · Updated last year