lumianph / gpuprec
gpuprec: Extended-Precision Libraries on GPUs
☆37 Updated 9 years ago
Alternatives and similar repositories for gpuprec
Users who are interested in gpuprec are comparing it to the libraries listed below.
- Kernel Tuning Toolkit ☆59 Updated 3 weeks ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆46 Updated 10 years ago
- This tool serves as a test harness for different optimization techniques to improve stencil computations performance in shared and distri… ☆20 Updated 2 years ago
- sparse matrix pre-processing library ☆82 Updated last year
- a software library containing Sparse functions written in OpenCL ☆175 Updated 5 years ago
- mallocMC: Memory Allocator for Many Core Architectures ☆55 Updated 3 weeks ago
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆35 Updated last month
- Next generation library for iterative sparse solvers for ROCm platform ☆81 Updated this week
- Launching collective tasks in bulk ☆37 Updated 5 years ago
- YASK--Yet Another Stencil Kit: a domain-specific language and framework to create high-performance stencil code for implementing finite-d… ☆107 Updated 10 months ago
- Use CUDA intrinsics with user-defined types ☆47 Updated 10 years ago
- GPU Code optimizer for stencil computations. Refer to our IPDPS'19 paper for more details ☆24 Updated 5 years ago
- Mandelbrot fractal on NVidia GPUs using CUDA dynamic parallelism and Mariani-Silver algorithm ☆29 Updated 11 years ago
- HIP back-end for Thrust that has been replaced by rocThrust ☆28 Updated 2 years ago
- A task benchmark ☆42 Updated 10 months ago
- CUDA tool set for non-C++ languages that provides similar functionality like Thrust, with NVRTC at its core. ☆59 Updated 2 years ago
- List all available information about all SYCL devices and platforms ☆15 Updated 4 years ago
- Codeplay project for contributions to the LLVM SYCL implementation ☆30 Updated 4 years ago
- The Combinatorial BLAS (CombBLAS) is an extensible distributed-memory parallel graph library offering a small but powerful set of linear … ☆75 Updated this week
- ulmBLAS ☆106 Updated 3 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- Implementation of AMD HIP for CPUs ☆22 Updated 4 years ago
- Compute applications. ☆24 Updated 5 years ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆68 Updated 2 years ago
- Tensor Contraction Code Generator ☆37 Updated 7 years ago
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆108 Updated 2 years ago
- A unified framework across multiple programming platforms ☆38 Updated this week
- data-parallel out-of-core library ☆50 Updated this week
- An OpenMP runtime implemented using HPX ☆24 Updated 2 years ago
- A library for C++/Fortran computer simulations (e.g. stencil codes, mesh-free, unstructured grids, n-body & particle methods). Scales fro… ☆40 Updated 4 years ago