KAdamek / SMFFT
Fast Fourier transform on GPU in shared memory for the AstroAccelerate project
☆26 Updated 4 years ago
Alternatives and similar repositories for SMFFT
Users who are interested in SMFFT are comparing it to the libraries listed below.
- Subset of BLAS routines optimized for NVIDIA GPUs ☆69 Updated 2 years ago
- Kernel Tuning Toolkit ☆60 Updated last month
- Benchmarking OpenBLAS on the Apple M1 ☆18 Updated 4 years ago
- Fast matrix multiplication ☆29 Updated 3 years ago
- BLAS implementation for Intel FPGA ☆78 Updated 4 years ago
- sparse matrix pre-processing library ☆82 Updated last year
- QCD for Intel Xeon Phi and Xeon processors ☆14 Updated last year
- A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037 ☆41 Updated last year
- Shared memory overlap-and-save method for NVIDIA GPUs using CUDA ☆16 Updated 2 years ago
- MagmaDNN: a simple deep learning framework in c++ ☆49 Updated 4 years ago
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆54 Updated 3 months ago
- Next generation library for iterative sparse solvers for ROCm platform ☆81 Updated this week
- CUDA-based implementation for linear 1D, 2D and 3D FFT-Shift functions. ☆22 Updated 9 years ago
- FFTX Project ☆24 Updated this week
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- The SparseX sparse kernel optimization library ☆39 Updated 6 years ago
- High-Performance Reproducible BLAS using posit arithmetic ☆12 Updated 3 years ago
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆35 Updated 2 months ago
- Next generation FFT implementation for ROCm ☆195 Updated this week
- PLASMA is a software package for solving problems in dense linear algebra using OpenMP ☆31 Updated last month
- HiCMA: Hierarchical Computations on Manycore Architectures ☆30 Updated 2 years ago
- CUDA implementation of the fundamental sum reduce operation. Aims to be as optimized as reasonable. ☆37 Updated 7 years ago
- CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core. ☆59 Updated 2 years ago
- Resources for the introductory GPU programming course of the HPC-AI specialized master's program ☆21 Updated last year
- ☆29 Updated 2 weeks ago
- ☆40 Updated 4 years ago
- A Massively Parallel FFT Library for CPU/GPU ☆56 Updated 4 years ago
- Multi-dimensional array programming framework for C++ and multi-GPU CUDA applications ☆28 Updated 8 years ago
- Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware ☆16 Updated 3 years ago
- Omni Compiler for C and Fortran programs with XcalableMP and OpenACC directives ☆61 Updated last year