DLTcollab / sse2neon
A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation
☆1,308 Updated last week
Related projects
Alternatives and complementary repositories for sse2neon
- Implementations of SIMD instruction sets for systems which don't natively support them. ☆2,406 Updated this week
- The platform-independent header that allows compiling any C/C++ code containing ARM NEON intrinsic functions for x86 target systems using S… ☆430 Updated 2 months ago
- C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE) ☆2,211 Updated last week
- Makes ARM NEON documentation accessible (with examples) ☆382 Updated 7 months ago
- C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, NEON for ARM. ☆2,070 Updated this week
- SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT ☆667 Updated last week
- Encapsulate the frequently used AVX instructions as independent modules to reduce repeated development workload. ☆114 Updated 10 months ago
- Automatically exported from code.google.com/p/sse2neon ☆285 Updated 4 years ago
- C++ template library for high performance SIMD based sorting algorithms ☆887 Updated last week
- Intel® Implicit SPMD Program Compiler ☆2,520 Updated this week
- [ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl ☆2,294 Updated 9 months ago
- Vector class library, latest version ☆1,308 Updated 9 months ago
- Portable header-only C++ low level SIMD library ☆1,242 Updated 2 months ago
- std::simd for GCC [ISO/IEC TS 19570:2018] ☆579 Updated last year
- An open optimized software library project for the ARM® Architecture ☆1,462 Updated last year
- Agenium Scale vectorization library for CPUs and GPUs ☆328 Updated 3 years ago
- A cross platform C99 library to get cpu features at runtime. ☆2,465 Updated last week
- Speed-up of over 50% on average vs traditional memcpy in gcc 4.9 or vc2012 ☆592 Updated 7 months ago
- SIMD Vector Classes for C++ ☆1,458 Updated 5 months ago
- Optimized implementations of various library functions for ARM architecture processors ☆601 Updated this week
- zlib replacement with optimizations for "next generation" systems. ☆1,577 Updated 2 weeks ago
- Implementation of SYCL and C++ standard parallelism for CPUs and GPUs from all vendors: The independent, community-driven compiler for C+… ☆1,390 Updated this week
- oneAPI Threading Building Blocks (oneTBB) ☆5,732 Updated this week
- CPU INFOrmation library (x86/x86-64/ARM/ARM64, Linux/Windows/Android/macOS/iOS) ☆1,016 Updated last week
- BS: a fast, lightweight, and easy-to-use C++17 thread pool library ☆2,211 Updated 6 months ago
- Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12, Chromium, R… ☆1,588 Updated 2 weeks ago
- Expressive Vector Engine - SIMD in C++ Goes Brrrr ☆964 Updated this week
- Public domain cross platform lock free thread caching 16-byte aligned memory allocator implemented in C ☆2,168 Updated 4 months ago
- A family of header-only, very fast and memory-friendly hashmap and btree containers. ☆2,556 Updated 2 weeks ago
- Official git repository for libdivide: optimized integer division ☆1,106 Updated 2 weeks ago