upsj / gpu_selection
Parallel selection on GPUs
☆16 Updated 4 years ago
Alternatives and similar repositories for gpu_selection
Users who are interested in gpu_selection are comparing it to the libraries listed below.
- ☆248 Updated last month
- CUDA implementation of the fundamental sum reduce operation. Aims to be as optimized as reasonable. ☆37 Updated 8 years ago
- ☆37 Updated this week
- ☆16 Updated 2 years ago
- Full-speed Array of Structures access ☆171 Updated 2 years ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆106 Updated 7 years ago
- High-performance, GPU-aware communication library ☆86 Updated 6 months ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆71 Updated 2 years ago
- Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A… ☆275 Updated 3 months ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 Updated 3 years ago
- A GPU accelerated error-bounded lossy compression for scientific data. ☆86 Updated last month
- A library to benchmark CUDA code, similar to google benchmark. ☆29 Updated 4 years ago
- ☆45 Updated 4 years ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆84 Updated last year
- ☆554 Updated last week
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆261 Updated 6 months ago
- STREAM, for lots of devices written in many programming models ☆345 Updated 10 months ago
- CUDA kernel author's tools ☆111 Updated 3 years ago
- 🎃 GPU load-balancing library for regular and irregular computations. ☆62 Updated last year
- ☆93 Updated 8 years ago
- A Library for fast Hash Tables on GPUs ☆125 Updated 3 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- Advanced Profiling and Analytics for AMD Hardware ☆159 Updated this week
- ☆62 Updated 7 months ago
- Kernel Tuner ☆353 Updated last week
- oneAPI Collective Communications Library (oneCCL) ☆238 Updated last week
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 Updated last year
- Online CUDA Occupancy Calculator ☆79 Updated 3 years ago
- MPI accelerator-integrated communication extensions ☆36 Updated 2 years ago
- Efficient SpGEMM on GPU using CUDA and CSR ☆56 Updated 2 years ago