cjang / GATLAS
GPU Automatically Tuned Linear Algebra Software
☆28 Updated 10 years ago
Alternatives and similar repositories for GATLAS
Users who are interested in GATLAS are comparing it to the libraries listed below.
- A managed platform and language for GPGPU ☆32 Updated 12 years ago
- Portable 128-bit SIMD intrinsics ☆59 Updated 2 years ago
- This repository contains components that will support percolation via OpenCL and CUDA ☆32 Updated 3 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆55 Updated 8 years ago
- Accelerator Programming Library in C++ ☆57 Updated 7 years ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆77 Updated 4 years ago
- Scientific library for high-precision computations and research ☆49 Updated 7 years ago
- clang with OpenMP 3.1 and some elements of OpenMP 4.0 support ☆90 Updated 10 years ago
- Flexible Library for Efficient Numerical Solutions ☆127 Updated 4 months ago
- Fast matrix multiplication ☆31 Updated 4 years ago
- C99/C++ header-only library for division via fixed-point multiplication by inverse ☆58 Updated last year
- Generalized Histograms for CUDA-capable GPUs ☆43 Updated 10 years ago
- Fork of magma to include more BLAS ☆28 Updated 8 years ago
- Research library for compile time optimization ☆12 Updated 6 years ago
- Programming Accelerators with C++ (PACXX) ☆57 Updated 7 years ago
- A lightweight C++ framework for vectorizing image-processing code ☆76 Updated 8 years ago
- A C/C++ task-based programming model for shared memory and distributed parallel computing. ☆72 Updated 5 years ago
- VIGRA2 based on xtensor ☆10 Updated 7 years ago
- Communication-Minimizing 2D Convolution in GPU Registers ☆30 Updated 12 years ago
- Sample implementation of a proposed C++ hashing framework ☆29 Updated 10 years ago
- A library for unconstrained minimization of smooth functions using Newton's method or L-BFGS. ☆37 Updated 7 years ago
- Asynchronous Task and Memory Interface, or ATMI, is a runtime framework and programming model for heterogeneous CPU-GPU systems. It provi… ☆68 Updated last year
- A Light-weight and Fast Template Matrix Library ☆134 Updated 12 years ago
- CMake Examples (CMake, CMake+CUDA, CMake+CUDA+PandaRoot) ☆42 Updated 12 years ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆36 Updated 9 years ago
- ☆74 Updated 2 years ago
- A domain-specific language and compiler for image processing ☆77 Updated 4 years ago
- Vector Math Library ☆82 Updated last month
- Computing Language Utility ☆72 Updated 9 years ago
- Intel(R) Concurrent Collections for C++ ☆116 Updated 2 years ago