intel / HFAV
☆14 Updated 2 years ago
Alternatives and similar repositories for HFAV
Users who are interested in HFAV are comparing it to the libraries listed below.
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 Updated 7 years ago
- Range-based for loops to iterate over a range of numbers or values ☆35 Updated 8 years ago
- sparse matrix pre-processing library ☆82 Updated last year
- GPU Optimization and Memory Abstraction Framework ☆32 Updated 5 years ago
- MPI wrapper generator, for writing PMPI tool libraries ☆35 Updated 3 months ago
- Full-speed Array of Structures access ☆171 Updated 2 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆69 Updated 2 years ago
- Recursive LAPACK Collection ☆42 Updated 3 years ago
- Fast matrix multiplication ☆29 Updated 3 years ago
- Cooperative Primitives for CUDA C++ Kernel Authors. This repository contains CUB PRs from Q4 2019 until Q4 2020. ☆22 Updated 4 years ago
- A library to benchmark CUDA code, similar to google benchmark. ☆29 Updated 4 years ago
- Checks to verify the usage of the MPI API in C and C++ code, based on Clang’s Static Analyzer and Clang-Tidy. ☆38 Updated 10 months ago
- Kernel Tuning Toolkit ☆60 Updated last month
- CNNs in Halide ☆23 Updated 9 years ago
- A unified framework across multiple programming platforms ☆41 Updated 3 weeks ago
- ☆29 Updated 2 weeks ago
- Multi-dimensional array programming framework for C++ and multi-GPU CUDA applications ☆28 Updated 8 years ago
- OpenMP Offloading Validation & Verification Suite; Official repository. We have migrated from bitbucket!! For documentation, results, pub… ☆58 Updated 2 weeks ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆77 Updated 4 years ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆46 Updated 10 years ago
- Multiple 1-stencil implementations using nvidia cuda. ☆13 Updated 7 years ago
- A library for C++/Fortran computer simulations (e.g. stencil codes, mesh-free, unstructured grids, n-body & particle methods). Scales fro… ☆40 Updated 4 years ago
- Absinthe is an optimization framework to fuse and tile stencil codes in one shot ☆14 Updated 5 years ago
- Scientific library for high-precision computations and research ☆49 Updated 7 years ago
- Experimental Linear Algebra Performance Studies ☆12 Updated 8 years ago
- Implementation of AMD HIP for CPUs ☆22 Updated 5 years ago
- Autonomic Performance Environment for eXascale (APEX) ☆48 Updated last month
- data-parallel out-of-core library ☆50 Updated last week
- A portable high-level API with CUDA or OpenCL back-end ☆54 Updated 7 years ago