NVIDIA / numba-cuda
The CUDA target for Numba
☆138 Updated this week
Alternatives and similar repositories for numba-cuda
Users who are interested in numba-cuda are comparing it to the libraries listed below.
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆47 Updated last week
- NVIDIA Math Libraries for the Python Ecosystem ☆329 Updated 2 weeks ago
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆61 Updated 2 months ago
- Data Parallel Extension for Numba ☆81 Updated 7 months ago
- KvikIO - High Performance File IO ☆213 Updated last week
- The Foundation for All Legate Libraries ☆218 Updated this week
- Data Parallel Extension for NumPy ☆109 Updated this week
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆81 Updated last month
- Python SYCL bindings and SYCL-based Python Array API library ☆113 Updated this week
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆56 Updated last month
- POC work on MLIR backend ☆55 Updated 10 months ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆337 Updated this week
- Analyze graph/hierarchical performance data using pandas dataframes ☆115 Updated 4 months ago
- RFC document, tooling and other content related to the array API standard ☆241 Updated last week
- Python bindings for UCX ☆135 Updated last week
- An Aspiring Drop-In Replacement for Pandas at Scale ☆73 Updated 3 years ago
- ☆35 Updated this week
- ☆31 Updated last week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆119 Updated this week
- ☆60 Updated last month
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆273 Updated 3 weeks ago
- Evaluating Large Language Models for CUDA Code Generation ComputeEval is a framework designed to generate and evaluate CUDA code from Lar… ☆50 Updated this week
- Advanced Profiling and Analytics for AMD Hardware ☆156 Updated this week
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆396 Updated 3 weeks ago
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API ☆83 Updated last year
- LLM training in simple, raw C/CUDA ☆99 Updated last year
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆274 Updated last week
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- A hands-on introduction to tuning GPU kernels using Kernel Tuner https://github.com/KernelTuner/kernel_tuner/ ☆31 Updated 2 months ago
- Deploy Dask using MPI4Py ☆54 Updated 2 months ago