roostaiyan / CudaSharedPtr
Shared pointer for CUDA device pointers and CUDA streams; a smart wrapper that allocates and deallocates CUDA device buffers.
☆0 Updated 2 years ago
Alternatives and similar repositories for CudaSharedPtr
Users who are interested in CudaSharedPtr are comparing it to the libraries listed below.
- Source code examples from the Parallel Forall Blog ☆96 Updated 6 years ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆154 Updated 2 years ago
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming" ☆89 Updated last year
- CUDA kernel author's tools ☆111 Updated 3 years ago
- a CUDA implementation of a priority queue ☆84 Updated 4 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- ☆58 Updated 9 months ago
- Fast and full-featured Matrix Market I/O library for C++, Python, and R ☆79 Updated 10 months ago
- MWE for using the Eigen library in CUDA kernels ☆119 Updated 2 years ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 Updated 3 years ago
- BGHT: High-performance static GPU hash tables. ☆66 Updated 2 months ago
- An expression template based linear algebra library running completely on the GPU using CUDA ☆25 Updated 4 years ago
- Tutorial for wrapping C++ library into Python using pybind11 and CMake ☆146 Updated last year
- Autonomic Performance Environment for eXascale (APEX) ☆48 Updated last month
- SuiteSparse: a suite of sparse matrix packages by @DrTimothyAldenDavis et al. with native CMake support ☆53 Updated last week
- Introductory Thrust workshop materials ☆43 Updated 12 years ago
- Examples for using SYCL on CUDA ☆62 Updated last week
- Easily display progress in C++17. Inspired by python's awesome tqdm library. ☆67 Updated last year
- ☆19 Updated 5 years ago
- Numbast is a tool to build an automated pipeline that converts CUDA APIs into Numba bindings. ☆47 Updated last week
- A warp-oriented dynamic hash table for GPUs ☆73 Updated last year
- Subset of BLAS routines optimized for NVIDIA GPUs ☆69 Updated 2 years ago
- ☆246 Updated 2 months ago
- A library to benchmark CUDA code, similar to google benchmark. ☆29 Updated 4 years ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆84 Updated last year
- DLA-Future ☆75 Updated last month
- ☆543 Updated this week
- Abstractions of memory, allocator, vector, tuple, shared_ptr, unique_ptr, bitset, variant and string working on both CPU and GPU ☆30 Updated 2 months ago
- A C++17 interface for HDF5 ☆96 Updated 2 months ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆55 Updated 3 months ago