ACANETS / eece-6540-labs
labs and exercises for EECE.6540 Heterogeneous Computing at UMass Lowell
☆13 · Updated last year
Related projects
Alternatives and complementary repositories for eece-6540-labs
- This is the implementation for the paper: AdaTune: Adaptive Tensor Program Compilation Made Efficient (NeurIPS 2020). ☆13 · Updated 3 years ago
- An external memory allocator example for PyTorch. ☆13 · Updated 3 years ago
- The code for the paper: NeuralPower: Predict and Deploy Energy-Efficient Convolutional Neural Networks. ☆21 · Updated 5 years ago
- Benchmark PyTorch custom operators. ☆13 · Updated last year
- Benchmark scripts for TVM. ☆73 · Updated 2 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo. ☆19 · Updated last year
- [TCAD 2021] Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA. ☆16 · Updated 2 years ago
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation. ☆26 · Updated 5 years ago
- MLSys 2021 paper: MicroRec: efficient recommendation inference by hardware and data structure solutions. ☆15 · Updated 3 years ago
- Accelerating CNNs' convolution operations on GPUs using memory-efficient data access patterns. ☆14 · Updated 6 years ago
- Course webpage for CS 217 Hardware Accelerators for Machine Learning, Stanford University. ☆98 · Updated last year
- [FPGA'21] CoDeNet is an efficient object detection model on PyTorch, with SOTA performance on VOC and COCO, based on CenterNet and Co-Desi… ☆25 · Updated last year
- This is the open-source version of TinyTS. The code is currently unpolished; we may clean it up in the future. ☆11 · Updated 4 months ago
- An 8-/16-/32-/64-bit floating-point number family. ☆16 · Updated 2 years ago
- The code for our paper "Neural Architecture Search as Program Transformation Exploration". ☆18 · Updated 3 years ago
- Benchmark for matrix multiplications between dense and block-sparse (BSR) matrices in TVM, blocksparse (Gray et al.), and cuSPARSE. ☆24 · Updated 4 years ago
- Post-training sparsity-aware quantization. ☆33 · Updated last year
- A repository with details on how to use the OpenCL backend (Xilinx/Intel). ☆24 · Updated 5 years ago