curtisseizert / CUDA-uint128
A 128-bit unsigned integer class for CUDA
☆43 Updated 2 years ago
Related projects
Alternatives and complementary repositories for CUDA-uint128
- CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups ☆206 Updated last month
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆99 Updated 7 years ago
- CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels and provides an on-the-fly view of the assembly. ☆107 Updated last year
- A library to benchmark CUDA code, similar to Google Benchmark. ☆28 Updated 3 years ago
- ROCm - AMDGPU Compute Application Binary Interface ☆40 Updated 2 years ago
- SYCL Benchmark Suite ☆56 Updated 2 months ago
- ☆50 Updated 4 years ago
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆99 Updated this week
- ☆53 Updated last week
- portDNN is a library implementing neural network algorithms written using SYCL ☆108 Updated 5 months ago
- Kernel Tuning Toolkit ☆55 Updated last week
- CUDA accelerated(X) Multi-Precision library ☆87 Updated 8 years ago
- Advanced Profiling and Analytics for AMD Hardware ☆135 Updated this week
- SYCL Open Source Specification ☆114 Updated this week
- High-level C++ for Accelerator Clusters ☆142 Updated this week
- A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆42 Updated 10 months ago
- Full-speed Array of Structures access ☆160 Updated last year
- SYCL Conformance Tests ☆62 Updated last week
- Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SYCL - all it takes to sum a lot of numbers fast! ☆73 Updated 6 months ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆77 Updated 5 years ago
- Short examples illustrating AVX2 intrinsics for simple tasks. ☆82 Updated 7 months ago
- ☆16 Updated 3 years ago
- CUDA kernel author's tools ☆107 Updated 2 years ago
- ROCm Parallel Primitives ☆161 Updated this week
- TLB Benchmarks ☆32 Updated 7 years ago
- An implementation of HIP that works on CPUs, across OSes. ☆112 Updated 7 months ago
- Next generation LAPACK implementation for ROCm platform ☆93 Updated this week
- Next generation SPARSE implementation for ROCm platform ☆116 Updated this week
- ☆215 Updated this week
- Learn OpenCL step by step. ☆131 Updated 2 years ago