IntelPython / numba-dpex
Data Parallel Extension for Numba
☆81 Updated 5 months ago
Alternatives and similar repositories for numba-dpex:
Users who are interested in numba-dpex are comparing it to the libraries listed below.
- Data Parallel Extension for NumPy ☆108 Updated this week
- Python SYCL bindings and SYCL-based Python Array API library ☆110 Updated this week
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆44 Updated this week
- The CUDA target for Numba ☆112 Updated this week
- Benchmark suite to evaluate Data Parallel Extensions for Python ☆17 Updated 8 months ago
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆81 Updated 2 weeks ago
- Deploy Dask using MPI4Py ☆55 Updated last month
- The Foundation for All Legate Libraries ☆216 Updated this week
- OpenMP for Python in Numba ☆104 Updated last week
- Experimental plugin for scikit-learn to be able to run (some estimators) on Intel GPUs via numba-dpex. ☆16 Updated last year
- Legate Sparse is a Legate library that aims to provide a distributed and accelerated drop-in replacement for the scipy.sparse library on … ☆20 Updated this week
- Sparse 3D FFT library with MPI, OpenMP, CUDA and ROCm support ☆53 Updated 2 months ago
- Analyze graph/hierarchical performance data using pandas dataframes ☆114 Updated 3 months ago
- An Aspiring Drop-In Replacement for Pandas at Scale ☆75 Updated 3 years ago
- oneAPI Technical Advisory Board (TAB) Meeting Notes ☆72 Updated last year
- Python combination of Ray and Numba providing compiled distributed arrays, remote functions, and actors. ☆32 Updated last year
- Standalone Spack Tutorial Repository ☆51 Updated last month
- Next generation LAPACK implementation for ROCm platform ☆100 Updated this week
- Python bindings for OpenSHMEM ☆16 Updated 2 weeks ago
- Training examples for SYCL ☆42 Updated last week
- POC work on MLIR backend ☆55 Updated 8 months ago
- ☆127 Updated this week
- A hands-on introduction to tuning GPU kernels using Kernel Tuner https://github.com/KernelTuner/kernel_tuner/ ☆30 Updated last month
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆205 Updated this week
- Advanced Profiling and Analytics for AMD Hardware ☆154 Updated this week
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆36 Updated last month
- Next generation library for iterative sparse solvers for ROCm platform ☆81 Updated last week
- RAJA Performance Suite ☆117 Updated this week
- A unified framework across multiple programming platforms ☆37 Updated 10 months ago
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆54 Updated last week