praveen-oak / max-pool-cuda
Implements the max-pool filter in CUDA, using both the built-in library and shared memory
☆7 Updated 5 years ago
Related projects:
- Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions. ☆13 Updated 3 years ago
- All about acceleration and compression of Deep Neural Networks ☆33 Updated 4 years ago
- Code for our ICLR'2021 paper "DrNAS: Dirichlet Neural Architecture Search" ☆42 Updated 3 years ago
- [NeurIPS 2021] “Stronger NAS with Weaker Predictors”, Junru Wu, Xiyang Dai, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Ye Yu, Zhangyang W… ☆27 Updated last year
- Benchmark for matrix multiplications between dense and block sparse (BSR) matrix in TVM, blocksparse (Gray et al.) and cuSparse. ☆24 Updated 4 years ago
- Official PyTorch Implementation of "Learning Architectures for Binary Networks" (ECCV2020) ☆26 Updated 3 years ago
- ☆42 Updated 7 months ago
- Code for the accepted paper "MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization" in NeurIPS 2019 ☆55 Updated 4 years ago
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021 ☆54 Updated 3 years ago
- Generic Neural Architecture Search via Regression (NeurIPS'21 Spotlight) ☆36 Updated 2 years ago
- ☆70 Updated 4 years ago
- [CVPRW 2021] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms ☆29 Updated last year
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation ☆26 Updated 4 years ago
- A PyTorch implementation of NASBench ☆52 Updated last year
- Class Project for 18663 - Implementation of FBNet (Hardware-Aware DNAS) ☆32 Updated 4 years ago
- [CVPR 2021] 'Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator' ☆38 Updated 2 years ago
- AutoGrow: Automatic Layer Growing in Deep Convolutional Networks (KDD 2020) ☆39 Updated 5 years ago
- ☆68 Updated 3 years ago
- ☆40 Updated 11 months ago
- Algorithm-hardware Co-design for Deformable Convolution ☆24 Updated 3 years ago
- ☆42 Updated 4 years ago
- A Hackable Quantization Library for PyTorch ☆18 Updated 3 years ago
- NAS Benchmark in "Prioritized Architecture Sampling with Monto-Carlo Tree Search", CVPR2021 ☆38 Updated 3 years ago
- This is a PyTorch implementation of the Scalpel. Node pruning for five benchmark networks and SIMD-aware weight pruning for LeNet-300-100… ☆38 Updated 5 years ago
- Personal Digest of NAS (Under Construction 🛠) ☆25 Updated 3 years ago
- Official PyTorch Implementation of HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning (NeurIPS 2021 Spotlight… ☆59 Updated last month
- ☆47 Updated 4 years ago
- Code for ICML 2021 submission ☆35 Updated 3 years ago
- ☆33 Updated 2 years ago
- Code for Understanding Architectures Learnt by Cell-based Neural Architecture Search ☆27 Updated 4 years ago