KAdamek / SMFFT
fast Fourier transform on GPU in shared memory for AstroAccelerate project
☆27 Updated 5 years ago
Alternatives and similar repositories for SMFFT
Users who are interested in SMFFT are comparing it to the libraries listed below.
- A domain-specific language and compiler for image processing ☆77 Updated 4 years ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆199 Updated 2 weeks ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆76 Updated 2 years ago
- Full-speed Array of Structures access ☆176 Updated 2 years ago
- BLAS implementation for Intel FPGA ☆78 Updated 5 years ago
- A 128 bit unsigned integer class for CUDA ☆46 Updated last year
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆81 Updated 6 months ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆95 Updated 4 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆260 Updated last year
- The Surprisingly ParalleL spArse Tensor Toolkit. ☆73 Updated 3 years ago
- MagmaDNN: a simple deep learning framework in c++ ☆51 Updated 5 years ago
- Multi-dimensional array programming framework for C++ and multi-GPU CUDA applications ☆28 Updated 9 years ago
- Fast Fast Hadamard Transform ☆89 Updated 4 years ago
- Multiple 1-stencil implementations using nvidia cuda. ☆13 Updated 8 years ago
- C++ Header-Only Library for High-Performance Tensor-Vector Multiplication ☆23 Updated 3 months ago
- Kernel Tuning Toolkit ☆67 Updated 2 weeks ago
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆185 Updated 3 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆114 Updated last year
- YASK--Yet Another Stencil Kit: a domain-specific language and framework to create high-performance stencil code for implementing finite-d… ☆113 Updated 6 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆178 Updated this week
- CUDA accelerated(X) Multi-Precision library ☆93 Updated 9 years ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆124 Updated 2 weeks ago
- CUDA tool set for non-C++ languages that provides similar functionality like Thrust, with NVRTC at its core. ☆59 Updated 3 years ago
- Next generation library for iterative sparse solvers for ROCm platform ☆94 Updated this week
- Concurrent CPU-GPU Programming using Task Models ☆106 Updated 6 years ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆114 Updated 2 weeks ago
- A C++ allocator based on cudaMallocManaged ☆23 Updated 7 years ago
- Autonomic Performance Environment for eXascale (APEX) ☆50 Updated 6 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆130 Updated 3 weeks ago
- ☆98 Updated 9 years ago