KULeuven-MICAS / htvm
Efficient Neural Network Deployment on Heterogeneous TinyML Platforms
☆13 Updated 11 months ago
Related projects:
- HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond ☆12 Updated last month
- Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads. ☆34 Updated this week
- HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators ☆99 Updated last week
- An Open Workflow to Build Custom SoCs and run Deep Models at the Edge ☆58 Updated last month
- An FPGA accelerator for general-purpose Sparse-Matrix Dense-Matrix Multiplication (SpMM). ☆62 Updated last month
- A systolic array simulator for multi-cycle MACs and varying-byte words, with the paper accepted to HPCA 2022. ☆60 Updated 2 years ago
- ☆38 Updated last week
- ☆81 Updated 3 months ago
- ☆65 Updated last year
- An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads. ☆20 Updated 5 months ago
- A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching ☆25 Updated last month
- Models and examples built with hls4ml ☆12 Updated 4 years ago
- ☆32 Updated 5 years ago
- Fast Emulation of Approximate DNN Accelerators in PyTorch ☆14 Updated 6 months ago
- SAMO: Streaming Architecture Mapping Optimisation ☆31 Updated 11 months ago
- A collection of tutorials for the fpgaConvNet framework. ☆28 Updated last month
- ☆23 Updated 6 months ago
- The code and artifacts associated with our MICRO'22 paper titled: "Adaptable Butterfly Accelerator for Attention-based NNs via Hardware … ☆103 Updated last year
- Low-level design of a chip built for optimizing/accelerating CNN classifiers over grayscale images. ☆12 Updated 5 years ago
- Provides the hardware code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerator… ☆23 Updated 4 years ago
- Performance and resource models for fpgaConvNet: a Streaming-Architecture-based CNN Accelerator. ☆25 Updated 3 months ago
- An LLVM pass that can generate CDFG and map the target loops onto a parameterizable CGRA. ☆53 Updated this week
- Linux docker for the DNN accelerator exploration infrastructure composed of Accelergy and Timeloop ☆41 Updated 3 months ago
- ☆28 Updated 2 weeks ago
- MICRO22 artifact evaluation for Sparseloop ☆34 Updated 2 years ago
- ☆17 Updated last year
- A Spatial Accelerator Generation Framework for Tensor Algebra. ☆48 Updated 2 years ago
- AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP (Full Paper a… ☆19 Updated last month
- High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS ☆77 Updated last month
- ☆47 Updated 8 months ago