superlich7 / caffe
This fork of BVLC/Caffe is dedicated to supporting the Cambricon deep learning processor and improving the framework's performance when running on the Machine Learning Unit (MLU).
☆41Updated 4 years ago
Alternatives and similar repositories for caffe:
Users who are interested in caffe are comparing it to the repositories listed below:
- CNStream is a streaming framework for building Cambricon machine learning pipelines http://forum.cambricon.com https://gitee.com/Solu…☆49Updated last year
- examples for tvm schedule API☆99Updated last year
- ☆29Updated last year
- heterogeneity-aware-lowering-and-optimization☆254Updated last year
- Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).☆108Updated this week
- code reading for tvm☆74Updated 3 years ago
- TVM tutorial☆65Updated 6 years ago
- Partial source code of deepcore v0.7☆27Updated 4 years ago
- A hands-on tutorial on the core principles of TVM☆59Updated 4 years ago
- tophub autotvm log collections☆70Updated 2 years ago
- ☆26Updated 10 months ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :)☆79Updated last year
- ☆36Updated 2 years ago
- A home for the final text of all TVM RFCs.☆102Updated 4 months ago
- To make it easy to benchmark AI accelerators☆183Updated 2 years ago
- Fast CUDA Kernels for ResNet Inference.☆171Updated 5 years ago
- Development repository for the Triton-Linalg conversion☆173Updated 2 weeks ago
- ☆95Updated 3 years ago
- GPU implementation of Winograd convolution☆10Updated 7 years ago
- Place for meetup slides☆140Updated 4 years ago
- ☆129Updated last month
- VeriSilicon Tensor Interface Module☆229Updated last month
- ☆142Updated last month
- benchmark for embedded-ai deep learning inference engines, such as NCNN / TNN / MNN / TensorFlow Lite, etc.☆204Updated 4 years ago
- ☆17Updated 10 months ago
- Automatic Schedule Exploration and Optimization Framework for Tensor Computations☆176Updated 2 years ago
- Automated machine learning as an AI-HPC benchmark☆65Updated 2 years ago
- Fork of https://source.codeaurora.org/quic/hexagon_nn/nnlib☆57Updated last year
- Yinghan's Code Sample☆305Updated 2 years ago
- Code for testing the native float16 matrix multiplication performance on Tesla P100 and V100 GPUs, based on cublasHgemm☆34Updated 5 years ago