taehokim20 / CPrune
CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution
☆16Updated last year
Related projects:
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation☆26Updated 4 years ago
- Post-training sparsity-aware quantization☆32Updated last year
- [ICCV 2021] Code release for "Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks"☆31Updated 2 years ago
- ☆113Updated last year
- Benchmark PyTorch Custom Operators☆13Updated last year
- Official implementation of "Searching for Winograd-aware Quantized Networks" (MLSys'20)☆26Updated 11 months ago
- Benchmark for matrix multiplications between dense and block-sparse (BSR) matrices in TVM, blocksparse (Gray et al.) and cuSPARSE.☆24Updated 4 years ago
- A Winograd Minimal Filter Implementation in CUDA☆20Updated 3 years ago
- Implementation for the paper "AdaTune: Adaptive Tensor Program Compilation Made Efficient" (NeurIPS 2020).☆13Updated 3 years ago
- Fast NPU-aware Neural Architecture Search☆21Updated 3 years ago
- TQT's PyTorch implementation.☆20Updated 2 years ago
- ☆19Updated 5 months ago
- Code for High-Capacity Expert Binary Networks (ICLR 2021).☆26Updated 2 years ago
- An 8-/16-/32-/64-bit floating point number family☆15Updated 2 years ago
- BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.☆49Updated last year
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo☆18Updated last year
- Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming☆94Updated 3 years ago
- An out-of-the-box PyTorch scaffold for neural network quantization-aware training (QAT) research. Website: https://github.com/zhutmost/neuralz…☆26Updated last year
- [CVPRW 2021] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms☆29Updated last year
- ☆66Updated last year
- Code for "Fast Sparse ConvNets" CVPR2020 submissions☆13Updated 4 years ago
- This repository contains the PyTorch scripts to train mixed-precision networks for microcontroller deployment, based on the memory contr…☆47Updated 4 months ago
- An external memory allocator example for PyTorch.☆13Updated 2 years ago
- ☆14Updated last week
- ☆36Updated 5 years ago
- ☆38Updated 4 years ago
- You Only Search Once: On Lightweight Differentiable Architecture Search for Resource-Constrained Embedded Platforms☆10Updated last year
- ☆17Updated 3 years ago
- ☆33Updated 2 years ago