Jimver / cuda-toolkit
GitHub Action to install CUDA
☆170 Updated last month
Alternatives and similar repositories for cuda-toolkit:
Users who are interested in cuda-toolkit are comparing it to the repositories listed below.
- ☆58 Updated 7 months ago
- NVIDIA Math Libraries for the Python Ecosystem ☆276 Updated last month
- An example combining scikit-build and pybind11 ☆125 Updated last week
- A nanobind example project ☆102 Updated this week
- The CUDA target for Numba ☆98 Updated this week
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆53 Updated last week
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆43 Updated last week
- CUDA Kernel Benchmarking Library ☆618 Updated this week
- manylinux docker images with CUDA Toolkit ☆12 Updated 2 months ago
- A next generation Python CMake adaptor and Python API for plugins ☆304 Updated this week
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆369 Updated last week
- AMD’s C++ library for accelerating tensor primitives ☆39 Updated this week
- A Visual Studio Code extension for building and debugging CUDA applications. ☆80 Updated 8 months ago
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme ☆59 Updated 3 weeks ago
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆533 Updated last month
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆314 Updated this week
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- Training material for Nsight developer tools ☆156 Updated 8 months ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆255 Updated 2 weeks ago
- ☆534 Updated this week
- LLM training in simple, raw C/CUDA ☆92 Updated 11 months ago
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆180 Updated 4 months ago
- Generate stubs for python modules ☆277 Updated last month
- CUDA kernel author's tools ☆111 Updated 2 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆261 Updated 3 months ago
- A user-friendly tool chain that enables the seamless execution of ONNX models using JAX as the backend. ☆109 Updated this week
- hipFFT is an FFT marshalling library. ☆61 Updated this week
- The Foundation for All Legate Libraries ☆212 Updated this week
- GPUOcelot: A dynamic compilation framework for PTX ☆185 Updated 2 months ago
- KvikIO - High Performance File IO ☆203 Updated this week