jordan-g / PyTorch-cuDNN-Convolution

PyTorch extension enabling direct access to cuDNN-accelerated C++ convolution functions.

☆13

Alternatives and similar repositories for PyTorch-cuDNN-Convolution

Users who are interested in PyTorch-cuDNN-Convolution are comparing it to the libraries listed below.

Guangxuan-Xiao / torch-int
This repository contains integer operators on GPUs for PyTorch.
☆206 Updated last year
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆200 Updated 3 years ago
cmu-catalyst / collage
System for automated integration of deep learning backends.
☆47 Updated 2 years ago
naver-aics / lut-gemm
☆61 Updated last year
cjf00000 / StatQuant
Code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"
☆28 Updated 4 years ago
aojunzz / NM-sparsity
☆237 Updated 2 years ago
pku-liang / FlexTensor
Automatic Schedule Exploration and Optimization Framework for Tensor Computations
☆177 Updated 3 years ago
microsoft / microxcaling
PyTorch emulation library for Microscaling (MX)-compatible data formats
☆257 Updated last month
cornell-zhang / dnn-quant-ocs
DNN quantization with outlier channel splitting
☆113 Updated 5 years ago
xuqiantong / CUDA-Winograd
Fast CUDA Kernels for ResNet Inference.
☆177 Updated 6 years ago
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆121 Updated 3 years ago
awslabs / raf
☆145 Updated 5 months ago
apache / tvm-rfcs
A home for the final text of all TVM RFCs.
☆105 Updated 9 months ago
HPDL-Group / Merak
☆80 Updated 2 months ago
facebookexperimental / triton
GitHub mirror of the triton-lang/triton repo.
☆48 Updated this week
tlc-pack / tenset
☆92 Updated 2 years ago
microsoft / SparTA
☆149 Updated 11 months ago
pku-liang / AMOS
Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators
☆113 Updated 2 years ago
masahi / tvm-cutlass-eval
☆40 Updated 3 years ago
BoyuanFeng / APNN-TC
☆19 Updated 3 years ago
UofT-EcoSystem / DietCode
DietCode Code Release
☆64 Updated 2 years ago
yifuwang / symm-mem-recipes
☆100 Updated 6 months ago
ceruleangu / Block-Sparse-Benchmark
Benchmark for matrix multiplications between dense and block sparse (BSR) matrix in TVM, blocksparse (Gray et al.) and cuSparse.
☆24 Updated 4 years ago
ColfaxResearch / cfx-article-src
☆124 Updated 2 months ago
Tiiiger / QPyTorch
Low Precision Arithmetic Simulation in PyTorch
☆280 Updated last year
Qualcomm-AI-research / FP8-quantization
☆152 Updated 2 years ago
msr-fiddle / piper
☆9 Updated 3 years ago
tlc-pack / TLCBench
Benchmark scripts for TVM
☆74 Updated 3 years ago
submission2019 / cnn-quantization
Quantization of Convolutional Neural Networks.
☆244 Updated 11 months ago
amirgholami / ai_and_memory_wall
AI and Memory Wall
☆216 Updated last year