Gram21 / GPUSorting
Implementation of a few sorting algorithms in OpenCL
☆35 Updated 5 years ago
Alternatives and similar repositories for GPUSorting:
Users who are interested in GPUSorting are comparing it to the libraries listed below.
- Simple example of using Vulkan for GPGPU computing ☆53 Updated 6 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 Updated 7 years ago
- ☆68 Updated 2 years ago
- Giddy - A lightweight GPU decompression library ☆42 Updated 5 years ago
- Implementation of the SYCL specification. ☆66 Updated 9 months ago
- OpenCL for Visual Studio Code ☆42 Updated 6 months ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆76 Updated 4 years ago
- Experimental ranges for CUDA ☆24 Updated 6 years ago
- Vectorization EDSL library ☆15 Updated 5 years ago
- Computing Language Utility ☆72 Updated 8 years ago
- SuiteSparse: a suite of sparse matrix packages by @DrTimothyAldenDavis et al. with native CMake support ☆53 Updated 8 months ago
- C++ convenience classes to be used with CUDA code, for both the host and the kernel parts. ☆55 Updated 6 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆111 Updated 10 months ago
- Whippletree, a novel approach to scheduling dynamic, irregular workloads on the GPU ☆21 Updated 9 years ago
- Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels. ☆70 Updated 9 years ago
- Polyfill some holes in the SSE intrinsics set ☆50 Updated 2 years ago
- Set of guidelines for porting OpenCL™ C to OpenCL C++ ☆40 Updated 7 years ago
- AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/Ope… ☆59 Updated this week
- ☆26 Updated 6 years ago
- Kernel Tuning Toolkit ☆59 Updated last week
- Fast Fourier Transform using the Vulkan API ☆32 Updated 4 years ago
- A reference implementation of std::simd, providing data parallel types in the C++ standard ☆12 Updated 5 years ago
- Example of how to use CUDA with CMake >= 3.8 ☆69 Updated last year
- A library to benchmark CUDA code, similar to google benchmark. ☆28 Updated 3 years ago
- The OpenCL Extension Wrangler Library ☆82 Updated 8 years ago
- CUDA Extension Wrangler ☆24 Updated 5 years ago
- SIMD optimizations related to 2D computer graphics ☆34 Updated 7 years ago
- mallocMC: Memory Allocator for Many Core Architectures ☆55 Updated last month
- Portable 128-bit SIMD intrinsics ☆58 Updated last year
- Synchronous, single-threaded, library-only SYCL implementation for debugging and verification. ☆35 Updated last month