KAdamek / SMFFT
Fast Fourier transform on GPU in shared memory for the AstroAccelerate project
☆27 · Updated 5 years ago
Alternatives and similar repositories for SMFFT
Users who are interested in SMFFT are comparing it to the libraries listed below.
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆198 · Updated this week
- Subset of BLAS routines optimized for NVIDIA GPUs ☆74 · Updated 2 years ago
- Next generation library for iterative sparse solvers for ROCm platform ☆89 · Updated this week
- A 128 bit unsigned integer class for CUDA ☆46 · Updated 10 months ago
- Kernel Tuning Toolkit ☆65 · Updated 2 weeks ago
- CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core. ☆59 · Updated 3 years ago
- Full-speed Array of Structures access ☆176 · Updated 2 years ago
- MagmaDNN: a simple deep learning framework in C++ ☆51 · Updated 5 years ago
- The SparseX sparse kernel optimization library ☆43 · Updated 6 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 · Updated last year
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆115 · Updated last week
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆37 · Updated 9 years ago
- A domain-specific language and compiler for image processing ☆77 · Updated 4 years ago
- tools to create performance and roofline plots from measured data ☆60 · Updated 11 years ago
- The Surprisingly ParalleL spArse Tensor Toolkit. ☆73 · Updated 3 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆260 · Updated 10 months ago
- An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluations ☆18 · Updated 5 years ago
- QCD for Intel Xeon Phi and Xeon processors ☆14 · Updated last year
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆178 · Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆124 · Updated last week
- Autonomic Performance Environment for eXascale (APEX) ☆49 · Updated 4 months ago
- YASK--Yet Another Stencil Kit: a domain-specific language and framework to create high-performance stencil code for implementing finite-d… ☆110 · Updated 4 months ago
- C++ Header-Only Library for High-Performance Tensor-Vector Multiplication ☆23 · Updated 3 weeks ago
- ☆48 · Updated 5 years ago
- Multi-dimensional array programming framework for C++ and multi-GPU CUDA applications ☆28 · Updated 9 years ago
- A unified framework across multiple programming platforms ☆42 · Updated 6 months ago
- bhSPARSE: A Sparse BLAS Library ☆16 · Updated 10 years ago
- Examples for HIP ☆212 · Updated 11 months ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆48 · Updated 10 years ago
- Short examples illustrating AVX2 intrinsics for simple tasks. ☆98 · Updated last year