OpenMathLib / OpenBLAS
OpenBLAS is an optimized BLAS library based on the GotoBLAS2 1.13 BSD version.
☆6,694 Updated this week
Alternatives and similar repositories for OpenBLAS:
Users who are interested in OpenBLAS are comparing it to the libraries listed below:
- oneAPI Deep Neural Network Library (oneDNN) ☆3,772 Updated this week
- ArrayFire: a general purpose GPU library. ☆4,679 Updated 2 weeks ago
- oneAPI Threading Building Blocks (oneTBB) ☆6,075 Updated this week
- a language for fast, portable data-parallel computation ☆6,029 Updated this week
- C++ tensors with broadcasting and lazy computing ☆3,498 Updated last week
- BLAS-like Library Instantiation Software Framework ☆2,412 Updated 2 weeks ago
- [ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl ☆4,960 Updated last year
- mlpack: a fast, header-only C++ machine learning library ☆5,322 Updated last week
- header only, dependency-free deep learning framework in C++14 ☆5,912 Updated 3 years ago
- C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE) ☆2,353 Updated this week
- ☆1,863 Updated last year
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,745 Updated last year
- Open deep learning compiler stack for cpu, gpu and specialized accelerators ☆12,227 Updated this week
- Performance-portable, length-agnostic SIMD with runtime dispatch ☆4,552 Updated this week
- DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.) ☆2,849 Updated 2 months ago
- Optimized primitives for collective multi-GPU communication ☆3,659 Updated last week
- C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, NEON for ARM. ☆2,140 Updated this week
- Compiler for Neural Network hardware accelerators ☆3,279 Updated 11 months ago
- Open MPI main development repository ☆2,325 Updated this week
- Tuned OpenCL BLAS ☆1,097 Updated this week
- CUDA Templates for Linear Algebra Subroutines ☆7,326 Updated this week
- A C++ GPU Computing Library for OpenCL ☆1,598 Updated last week
- CUDA integration for Python, plus shiny features ☆1,923 Updated 2 months ago
- The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologi… ☆2,951 Updated this week
- High-efficiency floating-point neural network inference operators for mobile, server, and Web ☆2,001 Updated this week
- Seamless operability between C++11 and Python ☆16,492 Updated this week
- A microbenchmark support library ☆9,413 Updated this week
- HIP: C++ Heterogeneous-Compute Interface for Portability ☆3,977 Updated this week
- Acceleration package for neural networks on multi-core CPUs ☆1,686 Updated 10 months ago
- Source code examples from the Parallel Forall Blog ☆1,279 Updated 8 months ago