cjang / GATLAS
GPU Automatically Tuned Linear Algebra Software
☆28 · Updated 9 years ago
Alternatives and similar repositories for GATLAS
Users who are interested in GATLAS are comparing it to the libraries listed below:
- A managed platform and language for GPGPU ☆32 · Updated 12 years ago
- This repository contains components that will support percolation via OpenCL and CUDA ☆32 · Updated 3 years ago
- Accelerator Programming Library in C++ ☆57 · Updated 7 years ago
- Scientific library for high-precision computations and research ☆49 · Updated 7 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 · Updated 7 years ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆77 · Updated 4 years ago
- Programming Accelerators with C++ (PACXX) ☆57 · Updated 7 years ago
- Fork of magma to include more BLAS ☆28 · Updated 8 years ago
- Portable 128-bit SIMD intrinsics ☆58 · Updated 2 years ago
- Research library for compile time optimization ☆12 · Updated 6 years ago
- Enable Polyhedral JIT compilation ☆9 · Updated 6 years ago
- Library to program with streams, events, and to queue own functions into a stream. ☆15 · Updated last year
- CMake Examples (CMake, CMake+CUDA, CMake+CUDA+PandaRoot) ☆41 · Updated 11 years ago
- A library for unconstrained minimization of smooth functions using Newton's method or L-BFGS. ☆36 · Updated 6 years ago
- Flexible Library for Efficient Numerical Solutions ☆127 · Updated last month
- Computing Language Utility ☆72 · Updated 9 years ago
- A C/C++ task-based programming model for shared memory and distributed parallel computing. ☆71 · Updated 4 years ago
- Vector Math Library ☆78 · Updated this week
- A Light-weight and Fast Template Matrix Library ☆133 · Updated 12 years ago
- clang with OpenMP 3.1 and some elements of OpenMP 4.0 support ☆91 · Updated 10 years ago
- C++ Summer Lecture Series 2016 ☆13 · Updated 8 years ago
- A lightweight C++ framework for vectorizing image-processing code ☆76 · Updated 8 years ago
- ViNN - an OpenCL accelerated neural networks library ☆33 · Updated 9 years ago
- TTC: A high-performance Compiler for Tensor Transpositions ☆20 · Updated 7 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 · Updated 8 years ago
- A GPU-based LZSS compression algorithm, highly tuned for NVIDIA GPGPUs and for streaming data, leveraging the respective strengths of CPU… ☆35 · Updated 9 years ago
- Simple C++/Qt Project for Searching in text files using BST / TST / Trie / Hash Data Structures. ☆11 · Updated 8 years ago
- UME::SIMD A library for explicit simd vectorization. ☆91 · Updated 7 years ago
- A library of tools for compiler construction. ☆11 · Updated 9 years ago
- stage the upgrade of hcc-clang to clang ToT ☆11 · Updated 5 years ago