Maratyszcza / FP16
Conversion to/from half-precision floating point formats
☆380Aug 16, 2025Updated 5 months ago
Alternatives and similar repositories for FP16
Users who are interested in FP16 are comparing it to the libraries listed below
- C99/C++ header-only library for division via fixed-point multiplication by inverse☆59Apr 14, 2024Updated last year
- Portable (POSIX/Windows/Emscripten) thread pool for C/C++☆388Jun 16, 2024Updated last year
- Portable 128-bit SIMD intrinsics☆59Jul 4, 2023Updated 2 years ago
- Low-precision matrix multiplication☆1,832Jan 29, 2024Updated 2 years ago
- The platform independent header allowing to compile any C/C++ code containing ARM NEON intrinsic functions for x86 target systems using S…☆485Oct 23, 2025Updated 3 months ago
- CPU INFOrmation library (x86/x86-64/ARM/ARM64, Linux/Windows/Android/macOS/iOS)☆1,153Jan 30, 2026Updated 2 weeks ago
- Acceleration package for neural networks on multi-core CPUs☆1,703Jun 11, 2024Updated last year
- ☆322Updated this week
- FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/☆1,530Updated this week
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl☆1,818Oct 9, 2023Updated 2 years ago
- High-efficiency floating-point neural network inference operators for mobile, server, and Web☆2,255Updated this week
- A toolchain file and examples using cmake for iOS development (this is a fork of a similar project found on code.google.com)☆26Nov 15, 2017Updated 8 years ago
- C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE, WebAssembly, VSX, RISC-…☆2,619Feb 4, 2026Updated last week
- common in-memory tensor structure☆1,166Jan 26, 2026Updated 2 weeks ago
- A CPU tool for benchmarking the peak of floating points☆576Feb 7, 2026Updated last week
- ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)☆17Apr 9, 2019Updated 6 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.☆1,006Sep 19, 2024Updated last year
- a language for fast, portable data-parallel computation☆6,569Updated this week
- oneAPI Deep Neural Network Library (oneDNN)☆3,960Updated this week
- ☆12Sep 29, 2017Updated 8 years ago
- row-major matmul optimization☆701Aug 20, 2025Updated 5 months ago
- Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators☆1,549Aug 28, 2019Updated 6 years ago
- Test winograd convolution written in TVM for CUDA and AMDGPU☆41Oct 12, 2018Updated 7 years ago
- The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologi…☆3,112Feb 6, 2026Updated last week
- Some deep learning models written with mxnet and C++11.☆12Feb 6, 2018Updated 8 years ago
- The network of the faceboxes☆15Sep 26, 2017Updated 8 years ago
- ☆1,988Jul 29, 2023Updated 2 years ago
- Efficient binary-decimal and decimal-binary conversion routines for IEEE doubles.☆1,183Feb 2, 2026Updated last week
- Single header library for creating image atlases.☆27Feb 5, 2022Updated 4 years ago
- CUDA Templates and Python DSLs for High-Performance Linear Algebra☆9,266Updated this week
- BLISlab: A Sandbox for Optimizing GEMM☆555Jun 17, 2021Updated 4 years ago
- SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT☆807Dec 25, 2025Updated last month
- Agenium Scale vectorization library for CPUs and GPUs☆337Oct 21, 2021Updated 4 years ago
- Tuned OpenCL BLAS☆1,166Feb 1, 2026Updated last week
- fast log and exp functions for AVX2/AVX-512☆240Mar 12, 2025Updated 11 months ago
- GLSL code generator to aid use of Vulkan's descriptor set indexing☆14Apr 20, 2019Updated 6 years ago
- symmetric int8 gemm☆67Jun 7, 2020Updated 5 years ago
- Train Neuronal networks to automate your home☆19Mar 1, 2023Updated 2 years ago
- C++ image processing and machine learning library with using of SIMD: SSE, AVX, AVX-512, AMX for x86/x64, NEON for ARM.☆2,233Updated this week