SC-SGS / Distributed_GPU_LSH_using_SYCL
Distributed k-nearest Neighbors using Locality Sensitive Hashing and SYCL
☆10 · Updated 3 years ago
Alternatives and similar repositories for Distributed_GPU_LSH_using_SYCL:
Users who are interested in Distributed_GPU_LSH_using_SYCL are comparing it to the libraries listed below.
- Fast and full-featured Matrix Market I/O library for C++, Python, and R ☆78 · Updated 9 months ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 · Updated last year
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆44 · Updated this week
- ☆29 · Updated 2 weeks ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆50 · Updated last month
- BGHT: High-performance static GPU hash tables. ☆63 · Updated last month
- BLAS implementation for Intel FPGA ☆78 · Updated 4 years ago
- A warp-oriented dynamic hash table for GPUs ☆73 · Updated last year
- Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceler… ☆29 · Updated 10 months ago
- A unified framework across multiple programming platforms ☆37 · Updated 10 months ago
- Data Parallel Extension for NumPy ☆108 · Updated this week
- Reference implementation of the draft C++ GraphBLAS specification. ☆32 · Updated 2 months ago
- 🎃 GPU load-balancing library for regular and irregular computations. ☆62 · Updated 10 months ago
- SuiteSparse: a suite of sparse matrix packages by @DrTimothyAldenDavis et al. with native CMake support ☆53 · Updated 9 months ago
- Cooperative Primitives for CUDA C++ Kernel Authors. This repository contains CUB PRs from Q4 2019 until Q4 2020. ☆22 · Updated 4 years ago
- Data Parallel Extension for Numba ☆81 · Updated 5 months ago
- TopK Algorithms Benchmark ☆10 · Updated 5 years ago
- Python SYCL bindings and SYCL-based Python Array API library ☆110 · Updated this week
- MagmaDNN: a simple deep learning framework in c++ ☆49 · Updated 4 years ago
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆72 · Updated last month
- fast Fourier transform on GPU in shared memory for AstroAccelerate project ☆26 · Updated 4 years ago
- Dynamic matrix type and algorithms for sparse matrices ☆19 · Updated 2 months ago
- Repository holding the code base to AC-SpGEMM : "Adaptive Sparse Matrix-Matrix Multiplication on the GPU" ☆28 · Updated 4 years ago
- Directed Acyclic Graph Execution Engine (DAGEE) is a C++ library that enables programmers to express computation and data movement, as ta… ☆47 · Updated 3 years ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 · Updated 3 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 · Updated 7 years ago
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API ☆81 · Updated last year
- The Hybrid Task Graph Scheduler API ☆40 · Updated this week
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆81 · Updated 2 weeks ago
- Efficient SpGEMM on GPU using CUDA and CSR ☆54 · Updated last year