NVIDIA / numba-cuda
The CUDA target for Numba
☆248Updated this week
Alternatives and similar repositories for numba-cuda
Users who are interested in numba-cuda are comparing it to the libraries listed below.
- NVIDIA Math Libraries for the Python Ecosystem☆543Updated 2 weeks ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings.☆56Updated this week
- The Foundation for All Legate Libraries☆233Updated last week
- Data Parallel Extension for Numba☆89Updated 4 months ago
- Data Parallel Extension for NumPy☆119Updated this week
- ☆53Updated last week
- KvikIO - High Performance File IO☆238Updated this week
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou…☆506Updated last week
- Python SYCL bindings and SYCL-based Python Array API library☆121Updated this week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")☆375Updated this week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆327Updated 3 weeks ago
- NPBench - A Benchmarking Suite for High-Performance NumPy☆91Updated last week
- ☆55Updated 2 months ago
- Kernel Tuner☆381Updated this week
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems.☆67Updated 2 weeks ago
- ☆101Updated this week
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial☆348Updated last month
- LLM training in simple, raw C/CUDA☆112Updated last year
- No-GIL Python environment featuring NVIDIA Deep Learning libraries.☆70Updated 9 months ago
- Python bindings for UCX☆139Updated 4 months ago
- Fast and Furious AMD Kernels☆346Updated last week
- HIP Python Low-level Bindings☆33Updated 2 months ago
- We aim to redefine Data Parallel libraries portability, performance, programmability and maintainability, by using C++ standard features, i…☆46Updated this week
- ☆74Updated this week
- Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Lar…☆94Updated 3 weeks ago
- RAPIDS Memory Manager☆681Updated this week
- AMD’s C++ library for accelerating tensor primitives☆48Updated last week
- Bandwidth test for ROCm☆73Updated last week
- GitHub Action to install CUDA☆199Updated last month
- CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-base…☆804Updated 2 weeks ago