andravin / wincnn
Winograd minimal convolution algorithm generator for convolutional neural networks.
☆600Updated 3 years ago
Related projects:
- Efficient Sparse-Winograd Convolutional Neural Networks (ICLR 2018)☆190Updated 5 years ago
- Caffe for Sparse Convolutional Neural Network☆238Updated last year
- Ristretto: Caffe-based approximation of convolutional neural networks.☆292Updated 5 years ago
- Fast CUDA Kernels for ResNet Inference.☆164Updated 5 years ago
- Caffe Implementation for Incremental network quantization☆191Updated 6 years ago
- An efficient framework for convolutional neural networks☆274Updated last year
- Collection of works aiming at reducing model sizes or building ASIC/FPGA accelerators for machine learning☆552Updated 7 months ago
- Training Deep Neural Networks with binary weights during propagations☆377Updated 8 years ago
- (New version is out: https://github.com/hpi-xnor/BMXNet-v2) BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet☆350Updated 4 years ago
- Quantization of Convolutional Neural networks.☆237Updated last month
- Explore the energy-efficient dataflow scheduling for neural networks.☆214Updated 4 years ago
- Automatic Schedule Exploration and Optimization Framework for Tensor Computations☆175Updated 2 years ago
- BLISlab: A Sandbox for Optimizing GEMM☆466Updated 3 years ago
- BinaryNets in TensorFlow with XNOR GEMM op☆154Updated 7 years ago
- ☆405Updated 5 years ago
- Generate a quantization parameter file for ncnn framework int8 inference☆519Updated 4 years ago
- Neural network visualizer and analyzer☆164Updated 5 years ago
- TVM integration into PyTorch☆452Updated 4 years ago
- tophub autotvm log collections☆70Updated last year
- Low-precision matrix multiplication☆1,772Updated 7 months ago
- Place for meetup slides☆140Updated 3 years ago
- Deep Compression on AlexNet☆655Updated 2 years ago
- Caffe for Sparse and Low-rank Deep Neural Networks☆376Updated 4 years ago
- benchmark for embedded-ai deep learning inference engines, such as NCNN / TNN / MNN / TensorFlow Lite etc.☆202Updated 3 years ago
- Caffe implementation of accurate low-precision neural networks☆118Updated 5 years ago
- A CUDNN minimal deep learning training code sample using LeNet.☆257Updated last year
- Assembler for NVIDIA Maxwell architecture☆940Updated last year
- A CPU tool for benchmarking the peak of floating points☆472Updated this week
- Simple Training and Deployment of Fast End-to-End Binary Networks☆159Updated 2 years ago
- CNN accelerated by CUDA. Tested on MNIST, finally reaching 99.76%☆181Updated 6 years ago