jordan-g / PyTorch-cuDNN-Convolution
PyTorch extension enabling direct access to cuDNN-accelerated C++ convolution functions.
☆13 · Updated 4 years ago
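For context, the snippet below is a minimal, hypothetical sketch of how such a C++/CUDA convolution extension is typically built and called from PyTorch using torch.utils.cpp_extension. The source file name, the exposed convolution function, and its argument layout are illustrative assumptions, not the repository's actual API.

```python
# Hypothetical usage sketch (not the repository's actual API): building and
# calling a cuDNN-backed convolution C++/CUDA extension from PyTorch via
# torch.utils.cpp_extension. File name, function name, and signature are
# assumptions for illustration.
import torch
from torch.utils.cpp_extension import load

# JIT-compile the extension from a C++ source that binds cuDNN calls
# (placeholder file name).
cudnn_conv = load(
    name="cudnn_convolution",
    sources=["cudnn_convolution.cpp"],
    extra_ldflags=["-lcudnn"],  # link against cuDNN
    verbose=True,
)

x = torch.randn(1, 3, 32, 32, device="cuda")  # NCHW input
w = torch.randn(16, 3, 3, 3, device="cuda")   # out_channels x in_channels x kH x kW

# Assumed binding: convolution(input, weight, stride, padding, dilation, groups)
y = cudnn_conv.convolution(x, w, (1, 1), (1, 1), (1, 1), 1)
print(y.shape)  # with 3x3 kernel, stride 1, padding 1: torch.Size([1, 16, 32, 32])
```

The usual appeal of this kind of extension is the ability to call cuDNN routines such as cudnnConvolutionForward directly and select a specific convolution algorithm, rather than going through the dispatching behind torch.nn.functional.conv2d.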
Alternatives and similar repositories for PyTorch-cuDNN-Convolution
Users interested in PyTorch-cuDNN-Convolution are comparing it to the repositories listed below.
- System for automated integration of deep learning backends.☆48 · Updated 2 years ago
- Code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"☆28 · Updated 4 years ago
- ☆55 · Updated last year
- This repository contains integer operators on GPUs for PyTorch.☆204 · Updated last year
- ☆38 · Updated 3 years ago
- DNN quantization with outlier channel splitting☆112 · Updated 5 years ago
- ☆42 · Updated 2 years ago
- ☆9 · Updated 3 years ago
- ☆72 · Updated 4 years ago
- An Efficient Pipelined Data Parallel Approach for Training Large Models☆76 · Updated 4 years ago
- ☆79 · Updated 3 weeks ago
- Post-training sparsity-aware quantization☆34 · Updated 2 years ago
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware.☆110 · Updated 5 months ago
- GitHub mirror of the triton-lang/triton repo.☆26 · Updated this week
- An extension of TVMScript for writing simple, high-performance GPU kernels with tensor cores.☆50 · Updated 9 months ago
- Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.☆63 · Updated last month
- ☆230 · Updated 2 years ago
- ☆76 · Updated 4 months ago
- FTPipe and related pipeline model parallelism research.☆41 · Updated 2 years ago
- Benchmark scripts for TVM☆74 · Updated 3 years ago
- ☆146 · Updated 2 years ago
- BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (ICLR 2021)☆40 · Updated 4 years ago
- ☆47 · Updated 3 years ago
- Measuring and predicting on-device metrics (latency, power, etc.) of machine learning models☆66 · Updated 2 years ago
- Code for an ICML 2021 submission☆34 · Updated 4 years ago
- ☆36 · Updated 2 years ago
- LLaMA INT4 CUDA inference with AWQ☆54 · Updated 3 months ago
- ☆20 · Updated 3 years ago
- Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming☆96 · Updated 3 years ago
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…☆61 · Updated last year