gcdart / dense-matrix-mult
☆21 Updated 12 years ago
Alternatives and similar repositories for dense-matrix-mult
Users who are interested in dense-matrix-mult are comparing it to the libraries listed below.
- Vector Math Library ☆80 Updated last month
- A Light-weight and Fast Template Matrix Library ☆134 Updated 12 years ago
- Flexible Library for Efficient Numerical Solutions ☆127 Updated 2 months ago
- NumPy-compatible multidimensional arrays in C++ ☆161 Updated 10 months ago
- C++ library for numerical arrays and tensor objects and operations with them, designed to allow Matlab-style programming. ☆52 Updated 2 years ago
- GPU Automatically Tuned Linear Algebra Software ☆28 Updated 10 years ago
- CMake module collection ☆30 Updated 10 years ago
- Portable 128-bit SIMD intrinsics ☆59 Updated 2 years ago
- String splitting benchmarks ☆39 Updated 9 years ago
- A portable high-level API with CUDA or OpenCL back-end ☆54 Updated 7 years ago
- Generalized Histograms for CUDA-capable GPUs ☆42 Updated 10 years ago
- The x template library ☆224 Updated 5 months ago
- C++ Lightweight Utility Extensions ☆73 Updated 3 years ago
- UME::SIMD A library for explicit simd vectorization. ☆91 Updated 7 years ago
- clang with OpenMP 3.1 and some elements of OpenMP 4.0 support ☆91 Updated 10 years ago
- Fast matrix multiplication ☆29 Updated 4 years ago
- CMake Examples (CMake, CMake+CUDA, CMake+CUDA+PandaRoot) ☆42 Updated 12 years ago
- C99/C++ header-only library for division via fixed-point multiplication by inverse ☆55 Updated last year
- 3D Tensors for Blaze (https://bitbucket.org/blaze-lib/blaze) ☆37 Updated 4 years ago
- Multi-dimensional C++ arrays which store objects in a Struct-of-Arrays (SoA) memory layout for efficient vectorization and zero address g… ☆74 Updated 4 years ago
- a heterogeneous multiGPU level-3 BLAS library ☆45 Updated 5 years ago
- Blazing-fast Expression Templates Library (ETL) with GPU support, in C++ ☆228 Updated 3 months ago
- Multi-dimensional C++ arrays which store objects in a Struct-of-Arrays (SoA) memory layout for efficient vectorization and zero address g… ☆36 Updated 4 years ago
- Boost.uBlas ☆116 Updated 2 weeks ago
- C++ implementation of concurrent Binary Search Trees ☆72 Updated 10 years ago
- fast log and exp functions for AVX2/AVX-512 ☆233 Updated 5 months ago
- Launching collective tasks in bulk ☆37 Updated 5 years ago
- Mirror kept for legacy. Moved to https://github.com/llvm/llvm-project ☆172 Updated 5 years ago
- A demonstration of speeding up a 1D convolution using SSE ☆51 Updated 8 years ago
- Generic SIMD intrinsic to allow for portable SIMD intrinsic programming ☆41 Updated 11 years ago