KAdamek / SMFFT
Fast Fourier transform on GPU in shared memory for the AstroAccelerate project
☆26Updated 4 years ago
Alternatives and similar repositories for SMFFT:
Users who are interested in SMFFT are comparing it to the libraries listed below.
- Shared memory overlap-and-save method for NVIDIA GPUs using CUDA☆16Updated 2 years ago
- Kernel Tuning Toolkit☆59Updated last month
- ☆40Updated 3 years ago
- Subset of BLAS routines optimized for NVIDIA GPUs☆68Updated 2 years ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support☆53Updated last month
- Generate simple index ranges in C++ and CUDA C++☆39Updated last year
- A C++ allocator based on cudaMallocManaged☆23Updated 6 years ago
- FFTX Project☆24Updated 4 months ago
- sparse matrix pre-processing library☆81Updated 11 months ago
- A Massively Parallel FFT Library for CPU/GPU☆56Updated 4 years ago
- A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037☆41Updated last year
- CUDA-based implementation for linear 1D, 2D and 3D FFT-Shift functions.☆22Updated 9 years ago
- Distributed Performance-portable Stencil Computation☆10Updated last year
- Next generation library for iterative sparse solvers for ROCm platform☆79Updated this week
- C++ Header-Only Library for High-Performance Tensor-Vector Multiplication☆21Updated 4 months ago
- MPI accelerator-integrated communication extensions☆33Updated 2 years ago
- Multi-dimensional array programming framework for C++ and multi-GPU CUDA applications☆28Updated 8 years ago
- The SparseX sparse kernel optimization library☆40Updated 6 years ago
- Parallel nonequispaced fast Fourier transforms☆16Updated 6 years ago
- Resources for the introductory GPU programming course of the HPC-AI specialized master's program☆22Updated last year
- HiCMA: Hierarchical Computations on Manycore Architectures☆30Updated 2 years ago
- List all available information about all SYCL devices and platforms☆15Updated 4 years ago
- Fast matrix multiplication☆29Updated 3 years ago
- C++ Template Linear Algebra PACKage☆43Updated this week
- Error-Free Transformations as building blocks for compensated algorithms☆15Updated 2 years ago
- Tensor Contraction Code Generator☆37Updated 7 years ago
- High-Performance Machine Learning Primitives☆12Updated 4 years ago
- A thread safe simple C++ wrapper for FFTW & MKL☆15Updated 3 years ago
- The fftMPI library performs 2d/3d FFTs in parallel for grids distributed across MPI processes.☆14Updated 2 years ago
- MATLAB Code for Parameters of Floating-Point Arithmetics☆8Updated 3 years ago