jordan-g / PyTorch-cuDNN-Convolution
PyTorch extension enabling direct access to cuDNN-accelerated C++ convolution functions.
☆13 Updated 4 years ago
Alternatives and similar repositories for PyTorch-cuDNN-Convolution
Users who are interested in PyTorch-cuDNN-Convolution are comparing it to the libraries listed below.
- This repository contains integer operators on GPUs for PyTorch. ☆223 Updated 2 years ago
- [MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration ☆200 Updated 3 years ago
- DNN quantization with outlier channel splitting (ICML'19) ☆113 Updated 5 years ago
- ☆243 Updated 3 years ago
- Automatic Schedule Exploration and Optimization Framework for Tensor Computations ☆180 Updated 3 years ago
- ☆80 Updated last year
- Quantization of Convolutional Neural networks. ☆249 Updated last year
- ☆166 Updated 2 years ago
- code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks" ☆29 Updated 5 years ago
- PyTorch emulation library for Microscaling (MX)-compatible data formats ☆319 Updated 5 months ago
- System for automated integration of deep learning backends. ☆47 Updated 3 years ago
- Low Precision Arithmetic Simulation in PyTorch ☆286 Updated last year
- PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction. ☆263 Updated 2 years ago
- Fast CUDA Kernels for ResNet Inference. ☆182 Updated 6 years ago
- PyTorch implementation of BRECQ, ICLR 2021 ☆284 Updated 4 years ago
- ☆41 Updated 3 years ago
- ☆80 Updated 6 months ago
- Repository for SysML19 Artifacts Evaluation ☆54 Updated 6 years ago
- [CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework ☆279 Updated last year
- ☆19 Updated 4 years ago
- A home for the final text of all TVM RFCs. ☆109 Updated last year
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆117 Updated 3 years ago
- The PyTorch implementation of Learned Step size Quantization (LSQ) in ICLR2020 (unofficial) ☆138 Updated 5 years ago
- Artifact repository for paper Automatic Generation of High-Performance Quantized Machine Learning Kernels ☆17 Updated 5 years ago
- Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming ☆98 Updated 4 years ago
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware. ☆112 Updated 11 months ago
- ☆32 Updated 3 years ago
- ☆147 Updated 11 months ago
- ☆49 Updated 3 years ago
- DietCode Code Release ☆65 Updated 3 years ago