andravin / wincnn
Winograd minimal convolution algorithm generator for convolutional neural networks.
☆626 Updated 5 years ago
Alternatives and similar repositories for wincnn
Users who are interested in wincnn compare it to the libraries listed below.
- Efficient Sparse-Winograd Convolutional Neural Networks (ICLR 2018) ☆193 Updated 6 years ago
- Ristretto: Quantization and compression of large AI models. Author: Philipp Gysel. ☆288 Updated last week
- Collection of works aiming at reducing model sizes or the ASIC/FPGA accelerator for machine learning ☆566 Updated 2 years ago
- Fast CUDA Kernels for ResNet Inference. ☆182 Updated 6 years ago
- Caffe Implementation for Incremental network quantization ☆191 Updated 7 years ago
- Neural network visualizer and analyzer ☆164 Updated 7 years ago
- Caffe implementation of accurate low-precision neural networks ☆119 Updated 7 years ago
- Winograd-based convolution implementation in OpenCL ☆28 Updated 9 years ago
- Optimizing Mobile Deep Learning on ARM GPU with TVM ☆182 Updated 7 years ago
- An efficient framework for convolutional neural networks ☆278 Updated 2 years ago
- Quantization of Convolutional Neural networks. ☆250 Updated last year
- (New version is out: https://github.com/hpi-xnor/BMXNet-v2) BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet ☆351 Updated 6 years ago
- tophub autotvm log collections ☆69 Updated 3 years ago
- TVM integration into PyTorch ☆456 Updated 6 years ago
- BinaryNets in TensorFlow with XNOR GEMM op ☆154 Updated 8 years ago
- Generate a quantization parameter file for ncnn framework int8 inference ☆519 Updated 5 years ago
- BLISlab: A Sandbox for Optimizing GEMM ☆555 Updated 4 years ago
- Caffe for Sparse and Low-rank Deep Neural Networks ☆383 Updated 5 years ago
- Graph Transforms to Quantize and Retrain Deep Neural Nets in TensorFlow ☆168 Updated 6 years ago
- Low-precision matrix multiplication ☆1,829 Updated 2 years ago
- High performance Cross-platform Inference-engine, you could run Anakin on x86-cpu, arm, nv-gpu, amd-gpu, bitmain and cambricon devices. ☆535 Updated 3 years ago
- Implementation of convolution layer in different flavors ☆68 Updated 8 years ago
- Subpart source code of deepcore v0.7 ☆27 Updated 5 years ago
- Training Deep Neural Networks with binary weights during propagations ☆382 Updated 9 years ago
- Heterogeneous Run Time version of Caffe. Added heterogeneous capabilities to the Caffe, uses heterogeneous computing infrastructure frame… ☆269 Updated 7 years ago
- An exploration of log domain "alternative floating point" for hardware ML/AI accelerators. ☆400 Updated 2 years ago
- Explore the energy-efficient dataflow scheduling for neural networks. ☆233 Updated 5 years ago
- Implementation of Winograd minimal convolution algorithm on Intel Architecture ☆39 Updated 8 years ago
- Automatic Schedule Exploration and Optimization Framework for Tensor Computations ☆183 Updated 3 years ago
- A CUDNN minimal deep learning training code sample using LeNet. ☆269 Updated 2 years ago