Jimver / cuda-toolkit
GitHub Action to install CUDA
☆199 Updated last month
Alternatives and similar repositories for cuda-toolkit
Users who are interested in cuda-toolkit are comparing it to the libraries listed below.
- The CUDA target for Numba ☆251 Updated this week
- ☆59 Updated 4 months ago
- An example combining scikit-build and pybind11 ☆141 Updated 2 weeks ago
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆507 Updated last week
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆57 Updated last week
- A nanobind example project ☆119 Updated last week
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆329 Updated this week
- Data Parallel Extension for NumPy ☆121 Updated last week
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆70 Updated 9 months ago
- A next generation Python CMake adaptor and Python API for plugins ☆439 Updated this week
- Python SYCL bindings and SYCL-based Python Array API library ☆121 Updated this week
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆380 Updated last week
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆569 Updated 4 months ago
- The Foundation for All Legate Libraries ☆235 Updated this week
- A High-Throughput Parallel Lossless Compressor for Scientific Data ☆75 Updated 3 years ago
- ☆618 Updated this week
- CUDA Kernel Benchmarking Library ☆809 Updated last week
- ☆281 Updated this week
- CUDA kernel author's tools ☆116 Updated 3 years ago
- LLM training in simple, raw C/CUDA ☆112 Updated last year
- manylinux docker images with CUDA Toolkit ☆19 Updated 2 months ago
- C++ library for reading and writing of numpy's .npy files ☆424 Updated last year
- Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloade… ☆613 Updated last year
- Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal, and Rust - all it takes to sum a lot of numbers fast! ☆116 Updated 6 months ago
- RAPIDS Memory Manager ☆681 Updated this week
- HIPIFY: Convert CUDA to Portable C++ Code ☆657 Updated this week
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- ☆44 Updated this week
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆260 Updated last year
- GPUOcelot: A dynamic compilation framework for PTX ☆219 Updated last year