KAdamek / GPU_Overlap-and-save_convolution
Shared memory overlap-and-save method for NVIDIA GPUs using CUDA
☆16Updated 2 years ago
Alternatives and similar repositories for GPU_Overlap-and-save_convolution
Users who are interested in GPU_Overlap-and-save_convolution are comparing it to the libraries listed below
- Fast Fourier transform on GPU in shared memory for the AstroAccelerate project☆26Updated 4 years ago
- CUDA-based implementation for linear 1D, 2D and 3D FFT-Shift functions.☆22Updated 9 years ago
- ☆40Updated 3 years ago
- CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core.☆59Updated 2 years ago
- A GPU based FX correlator for radio astronomy☆35Updated 6 years ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support☆53Updated 2 months ago
- High Availability Shared Pipeline Engine☆15Updated last year
- Kernel Tuning Toolkit☆59Updated 2 months ago
- FFTW code optimized for AMD based processors☆54Updated last week
- OpenMPL (Open Math Performance Library) is an open-source collection of math libraries, including BLAS, LAPACK, FFT, VML, and others.☆18Updated last year
- Next generation library for iterative sparse solvers for ROCm platform☆81Updated this week
- Resources for the introductory GPU programming course of the HPC-AI specialized master's program☆22Updated last year
- Volume-integral-equation solver for electromagnetic scattering and non-equilibrium fluctuational electrodynamics.☆23Updated 8 years ago
- Benchmark Suite for Heterogeneous FFT Implementations☆35Updated last year
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth☆104Updated 7 years ago
- A thread safe simple C++ wrapper for FFTW & MKL☆15Updated 3 years ago
- A Task-based Library for Solving Dense Nonsymmetric Eigenvalue Problems☆23Updated 2 years ago
- choosing FFT library...☆149Updated 2 years ago
- StarPU Runtime system☆15Updated 14 years ago
- CAKE Library for constant-bandwidth matrix multiplication on CPUs☆15Updated last year
- Subset of BLAS routines optimized for NVIDIA GPUs☆68Updated 2 years ago
- Examples for using SYCL on CUDA☆62Updated 2 months ago
- Generate simple index ranges in C++ and CUDA C++☆39Updated last year
- QCD for Intel Xeon Phi and Xeon processors☆14Updated last year
- A GPU performance prediction toolkit for CUDA programs☆16Updated 6 years ago
- rocWMMA☆111Updated this week
- Error-Free Transformations as building blocks for compensated algorithms☆15Updated 2 years ago
- Parallel nonequispaced fast Fourier transforms☆16Updated 6 years ago
- Multiple-precision GPU accelerated linear algebra routines (dense and sparse) based on residue number system☆18Updated 2 years ago
- ROCm Systems Profiler☆17Updated this week