0xBYTESHIFT / fp16

A class that represents a 16-bit floating-point (half-precision) value

☆11

Alternatives and similar repositories for fp16:

Users who are interested in fp16 compare it with the libraries listed below.

dpuyda / scheduling
A simple, fast, minimalistic header-only library for running async tasks and executing task graphs.
☆53 Updated 4 months ago
Maratyszcza / FXdiv
C99/C++ header-only library for division via fixed-point multiplication by inverse
☆50 Updated last year
jrmadsen / PTL
Parallel Tasking Library (PTL) - Lightweight C++11 multithreading tasking system featuring thread-pool, task-groups, and lock-free task q…
☆45 Updated 5 months ago
JishinMaster / simd_utils
A header-only library implementing common mathematical functions using SIMD intrinsics
☆103 Updated 2 months ago
MMLab-CU / CLUE
C++ Lightweight Utility Extensions
☆75 Updated 3 years ago
eyalroz / gpu-kernel-runner
Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line
☆23 Updated this week
rkuang9 / FLARE
A C++ neural network library for machine learning
☆14 Updated 11 months ago
csukuangfj / OpenCNN
An Open Convolutional Neural Network Framework in C++ From Scratch
☆61 Updated 4 years ago
yhirose / cpp-fstlib
A single-file, header-only C++17 library for Minimal Acyclic Subsequential Transducers, or Finite State Transducers
☆55 Updated 2 years ago
syoyo / safetensors-cpp
Header-only safetensors loader and saver in C++
☆56 Updated last week
Maratyszcza / psimd
Portable 128-bit SIMD intrinsics
☆58 Updated last year
felixguendling / cpp-serialization-benchmark
Comparison of C++ Serialization Libraries for Graph Data
☆34 Updated 3 years ago
fengwang / float16_t
C++20 implementation of a 16-bit floating-point type mimicking most of the IEEE 754 behavior. Single-file and header-only.
☆41 Updated last year
istmarc / tenseur
C++20 Tensor library
☆26 Updated 3 months ago
frozein / QuickMathHPP
a single-header math library
☆16 Updated 6 months ago
carlushuang / cpu_gemm_opt
How to design a CPU GEMM on x86 with avx256 that can beat OpenBLAS.
☆70 Updated 6 years ago
milakov / int_fastdiv
Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels.
☆70 Updated 9 years ago
CNugteren / CLCudaAPI
A portable high-level API with CUDA or OpenCL back-end
☆54 Updated 7 years ago
OpenPPL / ppl.common
Common libraries for PPL projects
☆29 Updated last month
p-ranav / task_system
Task System presented in "Better Code: Concurrency - Sean Parent"
☆42 Updated 4 years ago
berenger-eu / farm-sve
The Farm-SVE package provides a header that implements the ARM C language extensions (ACLE) for the ARM Scalable Vector Extension (SVE) i…
☆14 Updated last year
xboxfanj / math-neon
Automatically exported from code.google.com/p/math-neon
☆40 Updated 10 years ago
reyoung / avx_mathfun
AVX-optimized sin(), cos(), exp() and log() functions
☆123 Updated 3 years ago
mklarqvist / libalgebra
Fast C header-only library for popcnt, pospopcnt, and set algebraic operations
☆45 Updated 5 years ago
Bruce-Lee-LY / memory_pool
A simple and efficient memory pool implemented in C++11.
☆8 Updated 2 years ago
fengwang / matrix
A modern, C++20-native, single-file header-only dense 2D matrix library.
☆87 Updated last year
Maratyszcza / FP16
Conversion to/from half-precision floating point formats
☆347 Updated 8 months ago
jinmingyi1998 / opencl_kernels
An easy way to run, test, benchmark and tune OpenCL kernel files
☆23 Updated last year
Glavnokoman / vulkan-compute-example
Simple example of using Vulkan for GPGPU computing
☆53 Updated 6 years ago
niswegmann / small-matrix-inverse
SIMD optimised library for matrix inversion of 2x2, 3x3, and 4x4 matrices.
☆93 Updated 9 years ago