intel / HFAV
☆14 Updated 2 years ago
Alternatives and similar repositories for HFAV
Users that are interested in HFAV are comparing it to the libraries listed below
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 Updated 7 years ago
- GPU Optimization and Memory Abstraction Framework ☆32 Updated 5 years ago
- sparse matrix pre-processing library ☆82 Updated last year
- A library to benchmark CUDA code, similar to google benchmark. ☆28 Updated 4 years ago
- MPI wrapper generator, for writing PMPI tool libraries ☆34 Updated last month
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆46 Updated 10 years ago
- Compute applications. ☆24 Updated 5 years ago
- Program Generator for Small-Scale Linear Algebra Applications ☆29 Updated 6 years ago
- General Stride K-Nearest Neighbors ☆13 Updated 3 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 Updated 7 years ago
- Autonomic Performance Environment for eXascale (APEX) ☆47 Updated 2 weeks ago
- An implementation of ARMCI using MPI one-sided communication (RMA) ☆14 Updated 7 months ago
- A Sound and Complete Verification Tool for Warp-Specialized GPU Kernels ☆19 Updated 9 years ago
- Range-based for loops to iterate over a range of numbers or values ☆35 Updated 8 years ago
- A task benchmark ☆42 Updated 9 months ago
- Multi-dimensional array programming framework for C++ and multi-GPU CUDA applications ☆28 Updated 8 years ago
- Recursive LAPACK Collection ☆42 Updated 3 years ago
- Par4All is an automatic parallelizing and optimizing compiler (workbench) for C and Fortran sequential programs ☆52 Updated 9 years ago
- Loop Kernel Analysis and Performance Modeling Toolkit ☆93 Updated last month
- CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as lo… ☆30 Updated 2 weeks ago
- An OpenMP runtime implemented using HPX ☆24 Updated 2 years ago
- ☆29 Updated this week
- Orio is an open-source extensible framework for the definition of domain-specific languages and generation of optimized code for multiple… ☆37 Updated 3 years ago
- Checks to verify the usage of the MPI API in C and C++ code, based on Clang’s Static Analyzer and Clang-Tidy. ☆38 Updated 8 months ago
- OpenMP Offloading Validation & Verification Suite; Official repository. We have migrated from bitbucket!! For documentation, results, pub… ☆58 Updated last week
- Generalized Histograms for CUDA-capable GPUs ☆42 Updated 9 years ago
- CNNs in Halide ☆23 Updated 9 years ago
- Fork of magma to include more BLAS ☆28 Updated 8 years ago
- Archer, a data race detection tool for large OpenMP applications ☆63 Updated 4 years ago
- associative floating point addition ☆18 Updated last year