Jimver / cuda-toolkit
GitHub Action to install CUDA
☆166 Updated last month
Alternatives and similar repositories for cuda-toolkit:
Users who are interested in cuda-toolkit are comparing it to the libraries listed below.
- ☆58 Updated 6 months ago
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆350 Updated 2 weeks ago
- CUDA Kernel Benchmarking Library ☆582 Updated 3 months ago
- An example combining scikit-build and pybind11 ☆121 Updated this week
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆36 Updated last week
- A nanobind example project ☆101 Updated 2 weeks ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆253 Updated this week
- NVIDIA Math Libraries for the Python Ecosystem ☆237 Updated 2 months ago
- A next generation Python CMake adaptor and Python API for plugins ☆281 Updated this week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆310 Updated this week
- KvikIO - High Performance File IO ☆191 Updated this week
- ☆249 Updated this week
- Training material for Nsight developer tools ☆149 Updated 6 months ago
- The CUDA target for Numba ☆65 Updated this week
- Generate stubs for python modules ☆269 Updated last week
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆179 Updated 2 months ago
- CUDA kernel author's tools ☆110 Updated 2 years ago
- The Foundation for All Legate Libraries ☆204 Updated 2 weeks ago
- An extension library of WMMA API (Tensor Core API) ☆90 Updated 7 months ago
- ☆516 Updated this week
- Python SYCL bindings and SYCL-based Python Array API library ☆110 Updated this week
- HIPIFY: Convert CUDA to Portable C++ Code ☆556 Updated this week
- GPUOcelot: A dynamic compilation framework for PTX ☆175 Updated 3 weeks ago
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆523 Updated 2 weeks ago
- ☆70 Updated last month
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- AMD SMI ☆55 Updated this week
- ROCm BLAS marshalling library ☆133 Updated this week
- Example to build PyTorch CUDA extension using CMake (with pybind11 and scikit-build) ☆11 Updated 4 years ago
- Pybind11 tool for making docstrings from C++ comments ☆40 Updated 10 months ago