Conversion to/from half-precision floating point formats
☆381 · Aug 16, 2025 · Updated 6 months ago
Alternatives and similar repositories for FP16
Users that are interested in FP16 are comparing it to the libraries listed below
- C99/C++ header-only library for division via fixed-point multiplication by inverse ☆60 · Apr 14, 2024 · Updated last year
- Portable (POSIX/Windows/Emscripten) thread pool for C/C++ ☆387 · Jun 16, 2024 · Updated last year
- Portable 128-bit SIMD intrinsics ☆59 · Jul 4, 2023 · Updated 2 years ago
- Low-precision matrix multiplication ☆1,831 · Jan 29, 2024 · Updated 2 years ago
- The platform independent header allowing to compile any C/C++ code containing ARM NEON intrinsic functions for x86 target systems using S… ☆489 · Oct 23, 2025 · Updated 4 months ago
- CPU INFOrmation library (x86/x86-64/ARM/ARM64, Linux/Windows/Android/macOS/iOS) ☆1,155 · Feb 18, 2026 · Updated 2 weeks ago
- Acceleration package for neural networks on multi-core CPUs ☆1,701 · Jun 11, 2024 · Updated last year
- ☆321 · Feb 17, 2026 · Updated 2 weeks ago
- FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/ ☆1,535 · Updated this week
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,820 · Oct 9, 2023 · Updated 2 years ago
- High-efficiency floating-point neural network inference operators for mobile, server, and Web ☆2,267 · Updated this week
- A toolchain file and examples using cmake for iOS development (this is a fork of a similar project found on code.google.com) ☆26 · Nov 15, 2017 · Updated 8 years ago
- C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE, WebAssembly, VSX, RISC-… ☆2,638 · Feb 27, 2026 · Updated last week
- common in-memory tensor structure ☆1,171 · Jan 26, 2026 · Updated last month
- A CPU tool for benchmarking the peak of floating points ☆579 · Feb 7, 2026 · Updated last month
- ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial) ☆17 · Apr 9, 2019 · Updated 6 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. ☆1,005 · Sep 19, 2024 · Updated last year
- a language for fast, portable data-parallel computation ☆6,577 · Updated this week
- oneAPI Deep Neural Network Library (oneDNN) ☆3,958 · Updated this week
- ☆12 · Sep 29, 2017 · Updated 8 years ago
- row-major matmul optimization ☆707 · Feb 24, 2026 · Updated last week
- Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators ☆1,547 · Aug 28, 2019 · Updated 6 years ago
- Test winograd convolution written in TVM for CUDA and AMDGPU ☆41 · Oct 12, 2018 · Updated 7 years ago
- The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologi… ☆3,120 · Feb 27, 2026 · Updated last week
- The network of the faceboxes ☆15 · Sep 26, 2017 · Updated 8 years ago
- Some deep learning models written with mxnet and C++11. ☆12 · Feb 6, 2018 · Updated 8 years ago
- ☆1,992 · Jul 29, 2023 · Updated 2 years ago
- Efficient binary-decimal and decimal-binary conversion routines for IEEE doubles. ☆1,182 · Feb 2, 2026 · Updated last month
- Single header library for creating image atlases. ☆27 · Feb 5, 2022 · Updated 4 years ago
- CUDA Templates and Python DSLs for High-Performance Linear Algebra ☆9,348 · Updated this week
- BLISlab: A Sandbox for Optimizing GEMM ☆557 · Jun 17, 2021 · Updated 4 years ago
- SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT ☆812 · Dec 25, 2025 · Updated 2 months ago
- Agenium Scale vectorization library for CPUs and GPUs ☆338 · Oct 21, 2021 · Updated 4 years ago
- Tuned OpenCL BLAS ☆1,168 · Feb 1, 2026 · Updated last month
- fast log and exp functions for AVX2/AVX-512 ☆241 · Mar 12, 2025 · Updated 11 months ago
- symmetric int8 gemm ☆67 · Jun 7, 2020 · Updated 5 years ago
- Train Neuronal networks to automate your home ☆19 · Mar 1, 2023 · Updated 3 years ago
- GLSL code generator to aid use of Vulkan's descriptor set indexing ☆14 · Apr 20, 2019 · Updated 6 years ago
- C++ image processing and machine learning library with using of SIMD: SSE, AVX, AVX-512, AMX for x86/x64, NEON for ARM. ☆2,237 · Updated this week