suruoxi / half
IEEE 754-based C++ half-precision floating-point library, forked from http://half.sourceforge.net
☆23 Updated 3 years ago
Alternatives and similar repositories for half:
Users who are interested in half are comparing it to the libraries listed below.
- C99/C++ header-only library for division via fixed-point multiplication by inverse ☆49 Updated 10 months ago
- Portable 128-bit SIMD intrinsics ☆57 Updated last year
- Software implementation of ARM and x86 SIMD intrinsics ☆13 Updated 5 years ago
- UME::SIMD A library for explicit simd vectorization. ☆91 Updated 7 years ago
- Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels. ☆71 Updated 9 years ago
- A framework that helps implementing swizzle GPU kernels ☆42 Updated 4 years ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆83 Updated 11 months ago
- Emulating DMA Engines on GPUs for Performance and Portability ☆37 Updated 9 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆110 Updated 8 months ago
- ☆28 Updated 2 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆80 Updated 5 years ago
- Conversion to/from half-precision floating point formats ☆341 Updated 6 months ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆104 Updated 7 years ago
- how to design cpu gemm on x86 with avx256, that can beat openblas. ☆67 Updated 5 years ago
- An extension library of WMMA API (Tensor Core API) ☆88 Updated 7 months ago
- Bridging polyhedral analysis tools to the MLIR framework ☆107 Updated last year
- SYCL Reference Manual ☆27 Updated 9 months ago
- AVX-optimized sin(), cos(), exp() and log() functions ☆117 Updated 3 years ago
- Realtime GPU Profiler for AMD / NVIDIA / Intel GPUs ☆32 Updated last year
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆113 Updated 2 years ago
- MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com ☆38 Updated last year
- C++ implementation of a 16 bit floating-point type mimicking most of the IEEE 754 behaviour. Compatible with the half data type used as t… ☆141 Updated 12 years ago
- ☆67 Updated 2 years ago
- development repository for the open earth compiler ☆79 Updated 4 years ago
- Conversions to MLIR EmitC ☆126 Updated 2 months ago
- SYCL Conformance Tests ☆67 Updated last week
- CNNs in Halide ☆23 Updated 9 years ago
- A header only library implementing common mathematical functions using SIMD intrinsics ☆97 Updated this week
- Polyhedral Parallel Code Generation (source repository: http://repo.or.cz/ppcg.git) ☆121 Updated 2 years ago
- BGHT: High-performance static GPU hash tables. ☆61 Updated 5 months ago