MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine (accepted as full paper at FPT'23)
☆22 · Apr 17, 2024 · Updated last year
Alternatives and similar repositories for MaxEVA
Users who are interested in MaxEVA are comparing it to the libraries listed below.
- CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture ☆168 · Mar 12, 2026 · Updated last week
- An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE ☆17 · Aug 5, 2022 · Updated 3 years ago
- Xilinx Modifications to Halide ☆13 · May 3, 2021 · Updated 4 years ago
- Hands-on experience programming AI Engines using Vitis Unified Software Platform ☆40 · Jul 24, 2024 · Updated last year
- Generate Versal system design from ONNX model. AI engine kernels. Sub-microsecond speeds for autoencoders. ☆17 · Dec 29, 2024 · Updated last year
- ☆10 · Jun 4, 2024 · Updated last year
- ☆11 · Sep 3, 2022 · Updated 3 years ago
- SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted in FPGA'24) ☆36 · Mar 12, 2026 · Updated last week
- Python functions and scripts to analyse cyclostationary signals ☆26 · Feb 14, 2023 · Updated 3 years ago
- Train and deploy LUT-based neural networks on FPGAs ☆107 · Jun 12, 2024 · Updated last year
- The VD100 development board is based on the Xilinx Versal AI Edge series chip xcve2302 and is designed with a core board and a bottom boa… ☆18 · Jul 9, 2024 · Updated last year
- ☆30 · Apr 26, 2019 · Updated 6 years ago
- An MLIR-based toolchain for AMD AI Engine-enabled devices. ☆605 · Updated this week
- [DATE 2025] Official implementation and dataset of AIrchitect v2: Learning the Hardware Accelerator Design Space through Unified Represen… ☆19 · Jan 17, 2025 · Updated last year
- ☆32 · Mar 31, 2025 · Updated 11 months ago
- Generates intermediate tensor outputs for tflite ☆15 · Apr 12, 2019 · Updated 6 years ago
- Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts ☆134 · May 10, 2024 · Updated last year
- ☆128 · Updated this week
- TQT's PyTorch implementation. ☆21 · Dec 17, 2021 · Updated 4 years ago
- [TCAD 2021] Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA ☆17 · Jul 7, 2022 · Updated 3 years ago
- McPAT modeling framework ☆12 · Oct 18, 2014 · Updated 11 years ago
- Fork of LLVM to support AMD AIEngine processors ☆190 · Updated this week
- ☆10 · Jan 25, 2023 · Updated 3 years ago
- An alternative Vivado custom design example (to fully Vitis) for the User Logic Partition targeting VCK5000 ☆13 · Jul 16, 2024 · Updated last year
- A tool to generate optimized hardware files for univariate functions. ☆29 · Apr 5, 2024 · Updated last year
- FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations ☆98 · Oct 2, 2021 · Updated 4 years ago
- RISC-V Rocket Chip Strap-on-Booster with Fused Universal Neural Network (FuNN) eNNgine ☆21 · Mar 17, 2022 · Updated 4 years ago
- [ICML 2021] "Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators" by Yonggan Fu, Yonga… ☆16 · Jan 3, 2022 · Updated 4 years ago
- Code to accompany "Weightless Neural Networks for Efficient Edge Inference", PACT 2022 ☆22 · Nov 15, 2022 · Updated 3 years ago
- Adaptive floating-point based numerical format for resilient deep learning ☆14 · Apr 11, 2022 · Updated 3 years ago
- ☆14 · Mar 3, 2025 · Updated last year
- ☆17 · Feb 13, 2021 · Updated 5 years ago
- DeiT implementation for Q-ViT ☆25 · Apr 21, 2025 · Updated 11 months ago
- Allo Accelerator Design and Programming Framework (PLDI'24) ☆361 · Mar 13, 2026 · Updated last week
- The project is a simple example about how to use TensorFlow to train a ConNet model from labeled dataset and then use Vitis AI tools to d… ☆15 · Aug 15, 2020 · Updated 5 years ago
- HLS Custom-Precision Floating-Point Library ☆13 · Nov 6, 2017 · Updated 8 years ago
- [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design ☆130 · Jun 27, 2023 · Updated 2 years ago
- This repository contains the hardware implementation for Static BFP convolution on FPGA ☆10 · Oct 15, 2019 · Updated 6 years ago
- BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (ICLR 2021) ☆42 · Jan 12, 2021 · Updated 5 years ago