NVIDIA / numba-cuda
The CUDA target for Numba
☆193 · Updated this week
Alternatives and similar repositories for numba-cuda
Users who are interested in numba-cuda are comparing it to the libraries listed below.
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆50 · Updated last week
- NVIDIA Math Libraries for the Python Ecosystem ☆507 · Updated last month
- The Foundation for All Legate Libraries ☆228 · Updated this week
- Data Parallel Extension for NumPy ☆114 · Updated this week
- Data Parallel Extension for Numba ☆84 · Updated last week
- KvikIO - High Performance File IO ☆227 · Updated this week
- Python SYCL bindings and SYCL-based Python Array API library ☆117 · Updated this week
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆64 · Updated 5 months ago
- ☆45 · Updated this week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆302 · Updated last week
- ☆51 · Updated 4 months ago
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆454 · Updated last week
- LLM training in simple, raw C/CUDA ☆105 · Updated last year
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆89 · Updated 4 months ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆356 · Updated this week
- POC work on MLIR backend ☆60 · Updated last year
- ☆57 · Updated this week
- NVIDIA curated collection of educational resources related to general purpose GPU programming. ☆729 · Updated last week
- Kernel Tuner ☆366 · Updated this week
- GitHub Action to install CUDA ☆188 · Updated last week
- RAPIDS Memory Manager ☆623 · Updated last week
- Python bindings for UCX ☆140 · Updated 2 weeks ago
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆62 · Updated last month
- MLPerf™ logging library ☆37 · Updated last week
- ☆77 · Updated this week
- HIP Python Low-level Bindings ☆30 · Updated 4 months ago
- High-Performance SGEMM on CUDA devices ☆105 · Updated 8 months ago
- RFC document, tooling and other content related to the array API standard ☆255 · Updated last month
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆303 · Updated last month
- AMD’s C++ library for accelerating tensor primitives ☆46 · Updated last week