codeplaysoftware / portFFT
portFFT is a library implementing Fast Fourier Transforms using SYCL
☆17 Updated last week
Alternatives and similar repositories for portFFT:
Users who are interested in portFFT are comparing it to the libraries listed below.
- A pseudo random number generator library written against the SYCL API. ☆12 Updated 5 years ago
- Synchronous, single-threaded, library-only SYCL implementation for debugging and verification. ☆35 Updated last month
- Codeplay project for contributions to the LLVM SYCL implementation ☆30 Updated 4 years ago
- List all available information about all SYCL devices and platforms ☆15 Updated 4 years ago
- Examples for using SYCL on CUDA ☆62 Updated last week
- SYCL Reference Manual ☆27 Updated 10 months ago
- AMD’s C++ library for accelerating tensor primitives ☆38 Updated this week
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 Updated 3 years ago
- SYCL materials for ENCCS workshop ☆25 Updated last year
- SYCL Benchmark Suite ☆63 Updated 3 weeks ago
- A collection of samples written using the SYCL standard for C++. ☆18 Updated this week
- FFT implementation based on FFTPack, but with several improvements, cloned from ☆24 Updated 9 months ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- SYCL Conformance Tests ☆68 Updated this week
- Library for length agnostic SIMD intrinsic support and the corresponding math operations ☆20 Updated 3 years ago
- Compiler agnostic metaprogramming library providing concepts, type operations and tuples for C++ and cuda ☆84 Updated 2 weeks ago
- SYCL Open Source Specification ☆130 Updated last week
- The C++ Standard Library for your entire system. ☆15 Updated last month
- A shared-memory FFT for the Kokkos ecosystem ☆31 Updated last week
- BLAS++ is a C++ wrapper around CPU and GPU BLAS (basic linear algebra subroutines), developed as part of the SLATE project. ☆76 Updated last week
- Tensor Tiling Library ☆34 Updated last week
- Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceler… ☆28 Updated 8 months ago
- Header-only C++20 wrapper for MPI 4.0. ☆44 Updated last year
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆50 Updated last year
- FFTW code optimized for AMD based processors ☆50 Updated 5 months ago
- DLA-Future ☆70 Updated this week
- SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy… ☆110 Updated 2 months ago