headupinclouds / half
half precision floating point library (fork)
☆7 · Updated 9 years ago
Alternatives and similar repositories for half
Users who are interested in half are comparing it to the libraries listed below:
- A portable high-level API with CUDA or OpenCL back-end ☆54 · Updated 7 years ago
- BLAS OpenCL implementation. ☆15 · Updated 9 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 · Updated 7 years ago
- Mxnet Implementation of Google's MobileNets v2 ☆11 · Updated 6 years ago
- Communication-Minimizing 2D Convolution in GPU Registers ☆30 · Updated 11 years ago
- Proof-of-Concept CNN in Halide ☆22 · Updated 8 years ago
- tutorial to optimize GEMM performance on android ☆51 · Updated 8 years ago
- Matrix Shadow: Lightweight CPU/GPU Matrix and Tensor Template Library in C++/CUDA for (Deep) Machine Learning ☆33 · Updated 8 years ago
- Fast binary matrix product on CPU ☆10 · Updated 8 years ago
- SRCNN - Super-resolution using convolutional neural networks. Uses OpenCL to execute on the GPU. ☆29 · Updated 9 years ago
- a C++ wrapper of Caffe and mxnet to make predictions ☆50 · Updated 6 years ago
- a heterogeneous multiGPU level-3 BLAS library ☆45 · Updated 5 years ago
- This is a demo project that shows how you can utilize Caffe2's modular design and build a library on top of it. ☆40 · Updated 5 years ago
- Set of basic classes (vector, matrix, images and memory array) for CPU and GPU ☆17 · Updated 3 years ago
- Deep neural network framework (C/C++/CUDA). ☆31 · Updated 9 years ago
- ☆23 · Updated 8 years ago
- slic video segmentation ☆10 · Updated 9 years ago
- SqueezeNet Generator ☆31 · Updated 6 years ago
- ONNX Parser is a tool that automatically generates openvx inference code (CNN) from onnx binary model files. ☆17 · Updated 6 years ago
- Torch FFI-bindings for NNPACK ☆30 · Updated 7 years ago
- Optimized Gaussian blur filter on CPU. ☆17 · Updated 7 years ago
- ☆37 · Updated 9 years ago
- Convolutional neural networks C++ framework with CPU and GPU (CUDA) backends ☆178 · Updated 6 years ago
- Binarized Neural Network ☆9 · Updated 8 years ago
- Sublinear memory optimization for deep learning, reduce GPU memory cost to train deeper nets ☆28 · Updated 8 years ago
- Example code used in the CVPR 2015 tutorial ☆39 · Updated 9 years ago
- A framework for neural network ☆9 · Updated 6 years ago
- A Neural Algorithm of Artistic Style ☆29 · Updated 8 years ago
- Boda: A C++ Framework for Efficient Experiments in Computer Vision ☆63 · Updated 5 years ago
- Depth_conv for MobileNet ☆30 · Updated 4 years ago