SC-SGS / Distributed_GPU_LSH_using_SYCL
Distributed k-nearest Neighbors using Locality Sensitive Hashing and SYCL
☆10 · Updated 4 years ago
Alternatives and similar repositories for Distributed_GPU_LSH_using_SYCL
Users who are interested in Distributed_GPU_LSH_using_SYCL are comparing it to the libraries listed below.
- Subset of BLAS routines optimized for NVIDIA GPUs ☆76 · Updated 2 years ago
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆81 · Updated 5 months ago
- A fast shared & distributed memory task-based runtime in C++ ☆28 · Updated 4 years ago
- The Foundation for All Legate Libraries ☆233 · Updated this week
- ☆35 · Updated last week
- Fast and full-featured Matrix Market I/O library for C++, Python, and R ☆86 · Updated last year
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆212 · Updated this week
- A unified framework across multiple programming platforms ☆43 · Updated 8 months ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆158 · Updated 2 years ago
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast! ☆116 · Updated 6 months ago
- A warp-oriented dynamic hash table for GPUs ☆76 · Updated 2 years ago
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 · Updated 4 years ago
- CUDA kernel author's tools ☆115 · Updated 3 years ago
- CUDA implementation of the fundamental sum reduce operation. Aims to be as optimized as reasonable. ☆39 · Updated 8 years ago
- Parallel selection on GPUs ☆15 · Updated 4 years ago
- Data Parallel Extension for NumPy ☆121 · Updated this week
- Analyze graph/hierarchical performance data using pandas dataframes ☆118 · Updated 3 months ago
- GGNN: State of the Art Graph-based GPU Nearest Neighbor Search ☆169 · Updated 11 months ago
- A Library for fast Hash Tables on GPUs ☆132 · Updated 3 months ago
- Directed Acyclic Graph Execution Engine (DAGEE) is a C++ library that enables programmers to express computation and data movement, as ta… ☆47 · Updated 4 years ago
- Kernel Tuning Toolkit ☆67 · Updated last week
- A GPU accelerated error-bounded lossy compression for scientific data. ☆94 · Updated 3 weeks ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆85 · Updated last year
- Template for GPU accelerated python libraries ☆51 · Updated 2 years ago
- Python SYCL bindings and SYCL-based Python Array API library ☆121 · Updated last week
- Home of ALP/GraphBLAS and ALP/Pregel, featuring shared- and distributed-memory auto-parallelisation of linear algebraic and vertex-centri… ☆33 · Updated last week
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme ☆111 · Updated 2 months ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆260 · Updated last year
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆91 · Updated last week
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆56 · Updated this week