KAdamek / GPU_Overlap-and-save_convolution
Shared memory overlap-and-save method for NVIDIA GPUs using CUDA
☆16 Updated 2 years ago
Alternatives and similar repositories for GPU_Overlap-and-save_convolution:
Users who are interested in GPU_Overlap-and-save_convolution are comparing it to the libraries listed below:
- Fast Fourier transform on GPU in shared memory for the AstroAccelerate project ☆26 Updated 4 years ago
- ☆36 Updated 3 years ago
- CUDA-based implementation for linear 1D, 2D and 3D FFT-Shift functions. ☆22 Updated 9 years ago
- Kernel Tuning Toolkit ☆55 Updated 2 months ago
- A Massively Parallel FFT Library for CPU/GPU ☆54 Updated 4 years ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆48 Updated last week
- CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core. ☆59 Updated 2 years ago
- CUDA-based de-dispersion library ☆11 Updated 7 months ago
- A GPU-based FX correlator for radio astronomy ☆35 Updated 6 years ago
- High Availability Shared Pipeline Engine ☆15 Updated last year
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- The SparseX sparse kernel optimization library ☆39 Updated 6 years ago
- Example code for Intel AVX / AVX2 intrinsics. ☆128 Updated last year
- Some C++ code for computing a 1D and 2D convolution product using the FFT, implemented with GSL or FFTW ☆58 Updated 11 years ago
- Fast Fourier Transform implementation, computable on the CUDA platform. Seminar project for MI-PRC course at FIT CTU. ☆37 Updated last year
- MATLAB Code for Parameters of Floating-Point Arithmetics ☆9 Updated 2 years ago
- The fftMPI library performs 2D/3D FFTs in parallel for grids distributed across MPI processes. ☆14 Updated 2 years ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆67 Updated last year
- QCD for Intel Xeon Phi and Xeon processors ☆14 Updated 9 months ago
- MagmaDNN: a simple deep learning framework in C++ ☆48 Updated 4 years ago
- A thread-safe simple C++ wrapper for FFTW & MKL ☆15 Updated 3 years ago
- ☆42 Updated 4 years ago
- AstroAccelerate is a many-core accelerated software package for processing time-domain radio-astronomy data. ☆44 Updated 3 months ago
- ☆31 Updated 4 years ago
- portFFT is a library implementing Fast Fourier Transforms using SYCL ☆16 Updated 2 weeks ago
- A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆47 Updated last year
- Parallel nonequispaced fast Fourier transforms ☆16 Updated 6 years ago
- Serial and parallel implementations of matrix multiplication ☆39 Updated 3 years ago
- Volume-integral-equation solver for electromagnetic scattering and non-equilibrium fluctuational electrodynamics. ☆22 Updated 8 years ago