mz24cn / clnet
OpenCL for Nets - A deep learning framework based on OpenCL, written in C++. Supports popular MLP, RNN (LSTM), and CNN (ResNet) models. Easy to debug, with transparent data. No external library dependencies.
☆68Updated 5 years ago
Alternatives and similar repositories for clnet:
Users that are interested in clnet are comparing it to the libraries listed below
- Optimizing Mobile Deep Learning on ARM GPU with TVM☆181Updated 6 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm)☆47Updated 7 years ago
- ☆26Updated 8 years ago
- A Winograd based kernel for convolutions in deep learning framework☆15Updated 7 years ago
- Heterogeneous Run Time version of MXNet. Added heterogeneous capabilities to the MXNet, uses heterogeneous computing infrastructure frame…☆72Updated 7 years ago
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL☆135Updated 8 years ago
- Fork of https://source.codeaurora.org/quic/hexagon_nn/nnlib☆57Updated 2 years ago
- Heterogeneous Run Time version of TensorFlow. Added heterogeneous capabilities to the TensorFlow, uses heterogeneous computing infrastruc…☆36Updated 7 years ago
- flexible-gemm conv of deepcore☆17Updated 5 years ago
- Tengine gemm tutorial, step by step☆13Updated 4 years ago
- symmetric int8 gemm☆67Updated 4 years ago
- parallel algorithm based on cuda☆60Updated 7 years ago
- Tencent NCNN with added CUDA support☆69Updated 4 years ago
- Some C++ codes for computing a 1D and 2D convolution product using the FFT implemented with the GSL or FFTW☆58Updated 11 years ago
- Implement vgg16 model by ARM Compute Library☆32Updated 5 years ago
- how to design cpu gemm on x86 with avx256, that can beat openblas.☆70Updated 6 years ago
- OpenCL implementation of a NN and CNN☆22Updated 6 years ago
- ICML2017 MEC: Memory-efficient Convolution for Deep Neural Network, C++ implementation (unofficial)☆17Updated 6 years ago
- tensorflow c++ example for VS2015☆32Updated 6 years ago
- tutorial to optimize GEMM performance on android☆51Updated 9 years ago
- a c++/cuda template library for tensor lazy evaluation☆161Updated 2 years ago
- BLAS OpenCL implementation.☆15Updated 10 years ago
- An Example of MXNet Models Compilation and Deployment with NNVM in C++☆16Updated 7 years ago
- Winograd-based convolution implementation in OpenCL☆28Updated 8 years ago
- fastercnn modules optimize☆2Updated last year
- benchmark models for TNN, ncnn, MNN☆20Updated 4 years ago
- CNNs in Halide☆23Updated 9 years ago
- Collection of resources for the HiKey970 development board☆27Updated 6 years ago
- Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)☆60Updated last month
- This is a CNN Analyzer tool, based on Netscope by dgschwend/netscope☆41Updated 7 years ago