NVIDIA / numba-cuda
The CUDA target for Numba
☆239 Updated this week
Alternatives and similar repositories for numba-cuda
Users who are interested in numba-cuda are comparing it to the libraries listed below.
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆55 Updated last week
- NVIDIA Math Libraries for the Python Ecosystem ☆541 Updated last month
- The Foundation for All Legate Libraries ☆233 Updated 2 weeks ago
- Data Parallel Extension for NumPy ☆119 Updated last week
- Data Parallel Extension for Numba ☆88 Updated 3 months ago
- KvikIO - High Performance File IO ☆235 Updated last week
- Python SYCL bindings and SYCL-based Python Array API library ☆119 Updated this week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆323 Updated this week
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆69 Updated 8 months ago
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆499 Updated this week
- ☆53 Updated last week
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆91 Updated last month
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆368 Updated this week
- ☆97 Updated 3 weeks ago
- OpenMP for Python in Numba ☆151 Updated 2 months ago
- ☆53 Updated last month
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆65 Updated 2 months ago
- RAPIDS Memory Manager ☆670 Updated this week
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆342 Updated last month
- GitHub Action to install CUDA ☆197 Updated 2 weeks ago
- AMD’s C++ library for accelerating tensor primitives ☆47 Updated 3 weeks ago
- LLM training in simple, raw C/CUDA ☆109 Updated last year
- HIP Python Low-level Bindings ☆33 Updated 2 months ago
- Kernel Tuner ☆378 Updated 3 weeks ago
- Collection of scripts to build PyTorch and the domain libraries from source. ☆13 Updated 2 months ago
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆212 Updated last month
- Legate Sparse is a Legate library that aims to provide a distributed and accelerated drop-in replacement for the scipy.sparse library on … ☆24 Updated 2 weeks ago
- POC work on MLIR backend ☆61 Updated last year
- ☆71 Updated this week
- Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Lar… ☆91 Updated this week