Gram21 / GPUSorting
Implementation of a few sorting algorithms in OpenCL
☆36 · Updated 6 years ago
Alternatives and similar repositories for GPUSorting
Users who are interested in GPUSorting are comparing it to the libraries listed below.
- Simple example of using Vulkan for GPGPU computing ☆58 · Updated 7 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆55 · Updated 8 years ago
- ☆69 · Updated 3 years ago
- A machine vision library written in SYCL and C++ that shows performance-portable implementation of graph algorithms ☆164 · Updated last year
- Implementation of the SYCL specification. ☆66 · Updated last year
- Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels. ☆73 · Updated 10 years ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆78 · Updated 5 years ago
- Whippletree, a novel approach to scheduling dynamic, irregular workloads on the GPU ☆22 · Updated 10 years ago
- Corrected source for the OpenCL in Action book (work in progress) ☆63 · Updated 12 years ago
- Set of guidelines for porting OpenCL™ C to OpenCL C++ ☆41 · Updated 8 years ago
- Concurrent CPU-GPU Programming using Task Models ☆106 · Updated 6 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 · Updated last year
- ☆74 · Updated 2 years ago
- Efficient CUDA Stream Compaction Library ☆35 · Updated 2 years ago
- a software library containing Sparse functions written in OpenCL ☆175 · Updated 5 years ago
- Parallel k-D Tree Construction ☆57 · Updated 13 years ago
- Bitonic Sort for C and CUDA ☆16 · Updated 7 years ago
- OpenCL specific C++ libraries implemented in C++ for OpenCL kernel language published in releases of OpenCL-Docs ☆120 · Updated 2 years ago
- Portable 128-bit SIMD intrinsics ☆59 · Updated 2 years ago
- Easy to run kernels using OpenCL ☆187 · Updated 9 months ago
- Collection of samples and utilities for using ComputeCpp, Codeplay's SYCL implementation ☆325 · Updated 2 years ago
- ☆124 · Updated 13 years ago
- Execution primitives for C++ ☆155 · Updated 5 years ago
- Full-speed Array of Structures access ☆176 · Updated 2 years ago
- C++ convenience classes to be used with CUDA code, for both the host and the kernel parts. ☆55 · Updated 7 years ago
- Communication-Minimizing 2D Convolution in GPU Registers ☆30 · Updated 12 years ago
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast! ☆116 · Updated 6 months ago
- CMake Examples (CMake, CMake+CUDA, CMake+CUDA+PandaRoot) ☆42 · Updated 12 years ago
- Visual Computing Library ☆20 · Updated 3 weeks ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆38 · Updated 10 years ago