aff3ct / MIPP
Portable wrapper for SIMD and vector instructions written in C++11. Compatible with NEON, SSE, AVX, AVX-512 and SVE (length specific).
☆499 Updated last month
Alternatives and similar repositories for MIPP:
Users who are interested in MIPP are comparing it to the libraries listed below.
- Agenium Scale vectorization library for CPUs and GPUs ☆332 Updated 3 years ago
- std::simd for GCC [ISO/IEC TS 19570:2018] ☆607 Updated 2 years ago
- Portable header-only C++ low level SIMD library ☆1,270 Updated 7 months ago
- stdgpu: Efficient STL-like Data Structures on the GPU ☆1,212 Updated 2 months ago
- A lightweight high performance tensor algebra framework for modern C++ ☆779 Updated last year
- C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE) ☆2,347 Updated this week
- SIMD Vector Classes for C++ ☆1,479 Updated 10 months ago
- Header-only C++ program options parser library ☆176 Updated 2 years ago
- Expressive Vector Engine - SIMD in C++ Goes Brrrr ☆1,174 Updated this week
- C++ multidimensional arrays in the spirit of the STL ☆201 Updated 3 months ago
- SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT ☆718 Updated this week
- A C++ compile-time math library using generalized constant expressions ☆768 Updated 9 months ago
- Intel TBB with CMake build system ☆383 Updated 2 years ago
- Vector class library, latest version ☆1,352 Updated last year
- Reference implementation of mdspan targeting C++23 ☆449 Updated this week
- VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP ☆710 Updated 6 months ago
- C++ library for reading and writing of numpy's .npy files ☆394 Updated 6 months ago
- Yet Another Serialization ☆755 Updated last month
- The platform independent header allowing to compile any C/C++ code containing ARM NEON intrinsic functions for x86 target systems using S… ☆455 Updated last week
- Portable (POSIX/Windows/Emscripten) thread pool for C/C++ ☆366 Updated 9 months ago
- We make any object thread-safe and std::shared_mutex 10 times faster to achieve the speed of lock-free algorithms on >85% reads ☆517 Updated 2 years ago
- Conversion to/from half-precision floating point formats ☆346 Updated 8 months ago
- a c++/cuda template library for tensor lazy evaluation ☆163 Updated last year
- Open Source Parallel STL implementation ☆524 Updated last year
- Glob for C++17 ☆259 Updated 11 months ago
- Performance benchmark framework for C++ with nanoseconds measure precision ☆294 Updated last year
- C++ Benchmark Authoring Library/Framework ☆843 Updated 3 months ago
- Simple, fast, accurate single-header microbenchmarking functionality for C++11/14/17/20 ☆1,518 Updated 6 months ago
- A Compositional Numeric Library for C++ ☆650 Updated 11 months ago
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆533 Updated last month