agenium-scale / nsimd
Agenium Scale vectorization library for CPUs and GPUs
☆337 · Oct 21, 2021 · Updated 4 years ago
Alternatives and similar repositories for nsimd
Users who are interested in nsimd compare it to the libraries listed below.
- C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE, WebAssembly, VSX, RISC-… ☆2,619 · Feb 4, 2026 · Updated last week
- std::simd for GCC [ISO/IEC TS 19570:2018] ☆639 · Mar 10, 2023 · Updated 2 years ago
- SIMD Vector Classes for C++ ☆1,515 · Jun 6, 2024 · Updated last year
- Expressive Vector Engine - SIMD in C++ Goes Brrrr ☆1,295 · Feb 7, 2026 · Updated last week
- Boost SIMD ☆230 · Apr 10, 2019 · Updated 6 years ago
- Portable header-only C++ low level SIMD library ☆1,300 · Aug 26, 2024 · Updated last year
- Portable wrapper for SIMD and vector instructions written in C++11. Compatible with NEON, SSE, AVX, AVX-512 and SVE (length specific). ☆518 · Dec 4, 2025 · Updated 2 months ago
- SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT ☆807 · Dec 25, 2025 · Updated last month
- Implementations of SIMD instruction sets for systems which don't natively support them. ☆2,944 · Updated this week
- Vector class library, latest version ☆1,432 · Feb 1, 2024 · Updated 2 years ago
- A lightweight high performance tensor algebra framework for modern C++ ☆830 · Jul 8, 2025 · Updated 7 months ago
- Performance-portable, length-agnostic SIMD with runtime dispatch ☆5,317 · Jan 29, 2026 · Updated 2 weeks ago
- C++ image processing and machine learning library with using of SIMD: SSE, AVX, AVX-512, AMX for x86/x64, NEON for ARM. ☆2,233 · Updated this week
- parser combinator and AST generator in c++17 ☆24 · Feb 16, 2023 · Updated 2 years ago
- UME::SIMD A library for explicit simd vectorization. ☆91 · Jan 19, 2018 · Updated 8 years ago
- Fundamental C++ SIMD types for Intel CPUs (sse, avx, avx2, avx512) ☆360 · Jun 28, 2021 · Updated 4 years ago
- An alternative to Boost.MPI for a user friendly C++ interface for MPI (MPICH). ☆19 · Feb 24, 2018 · Updated 7 years ago
- Compiler agnostic metaprogramming library providing concepts, type operations and tuples for C++ and cuda ☆97 · Dec 4, 2025 · Updated 2 months ago
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆935 · Feb 7, 2026 · Updated last week
- Portable 128-bit SIMD intrinsics ☆59 · Jul 4, 2023 · Updated 2 years ago
- WIP · CUDA compatibility for Blaze · https://bitbucket.org/blaze-lib/blaze ☆21 · Nov 18, 2019 · Updated 6 years ago
- GPU Automatically Tuned Linear Algebra Software ☆28 · Sep 1, 2015 · Updated 10 years ago
- Library for length agnostic SIMD intrinsic support and the corresponding math operations ☆21 · Nov 15, 2021 · Updated 4 years ago
- Thin, unified, C++-flavored wrappers for the CUDA APIs ☆872 · Feb 2, 2026 · Updated last week
- C++ implementation of a fast hash map and hash set using robin hood hashing ☆1,448 · Nov 2, 2025 · Updated 3 months ago
- The C++ Standard Library for Parallelism and Concurrency ☆2,790 · Updated this week
- C++ tensors with broadcasting and lazy computing ☆3,703 · Jan 29, 2026 · Updated 2 weeks ago
- A streamlined CMake build system foundation for developing HPC software ☆284 · Feb 5, 2026 · Updated last week
- Abstraction Library for Parallel Kernel Acceleration ☆404 · Jan 29, 2026 · Updated 2 weeks ago
- [ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl ☆2,309 · Feb 7, 2024 · Updated 2 years ago
- A Compositional Numeric Library for C++ ☆684 · Apr 26, 2024 · Updated last year
- A Low-Level Abstraction of Memory Access ☆93 · Feb 29, 2024 · Updated last year
- The platform independent header allowing to compile any C/C++ code containing ARM NEON intrinsic functions for x86 target systems using S… ☆485 · Oct 23, 2025 · Updated 3 months ago
- A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation ☆1,480 · Jan 19, 2026 · Updated 3 weeks ago
- stdgpu: Efficient STL-like Data Structures on the GPU ☆1,252 · Feb 7, 2026 · Updated last week
- pocl - Portable Computing Language ☆1,050 · Updated this week
- A C++ compile-time math library using generalized constant expressions ☆813 · Jun 22, 2024 · Updated last year
- [ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl ☆4,998 · Feb 8, 2024 · Updated 2 years ago
- contiguous container library - arrays with customizable allocation, small buffer optimization and more ☆258 · Mar 25, 2020 · Updated 5 years ago