MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine (accepted as full paper at FPT'23)
☆22 · Apr 17, 2024 · Updated 2 years ago
Alternatives and similar repositories for MaxEVA
Users that are interested in MaxEVA are comparing it to the libraries listed below.
- CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture ☆173 · Mar 12, 2026 · Updated 3 months ago
- An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE ☆17 · Aug 5, 2022 · Updated 3 years ago
- Hands-on experience programming AI Engines using Vitis Unified Software Platform ☆42 · Jul 24, 2024 · Updated last year
- Xilinx Modifications to Halide ☆13 · May 3, 2021 · Updated 5 years ago
- Generate Versal system design from ONNX model. AI Engine kernels. Sub-microsecond speeds for autoencoders. ☆19 · Dec 29, 2024 · Updated last year
- ☆12 · Jun 4, 2024 · Updated 2 years ago
- ☆11 · Sep 3, 2022 · Updated 3 years ago
- SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted in FPGA'24) ☆36 · Mar 12, 2026 · Updated 3 months ago
- Python functions and scripts to analyse cyclostationary signals ☆26 · Feb 14, 2023 · Updated 3 years ago
- Train and deploy LUT-based neural networks on FPGAs ☆117 · Jun 12, 2024 · Updated 2 years ago
- The VD100 development board is based on the Xilinx Versal AI Edge series chip xcve2302 and is designed with a core board and a bottom boa… ☆20 · Jul 9, 2024 · Updated last year
- ☆30 · Apr 26, 2019 · Updated 7 years ago
- An MLIR-based toolchain for AMD AI Engine-enabled devices. ☆659 · Updated this week
- [DATE 2025] Official implementation and dataset of AIrchitect v2: Learning the Hardware Accelerator Design Space through Unified Represen… ☆20 · Jan 17, 2025 · Updated last year
- Generates intermediate tensor outputs for tflite ☆15 · Apr 12, 2019 · Updated 7 years ago
- TQT's pytorch implementation. ☆21 · Dec 17, 2021 · Updated 4 years ago
- ☆32 · Mar 31, 2025 · Updated last year
- Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts ☆141 · May 10, 2024 · Updated 2 years ago
- [TCAD 2021] Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA ☆17 · Jul 7, 2022 · Updated 3 years ago
- Hands-on experience using the Vitis unified software platform with Xilinx FPGA hardware ☆49 · Jul 24, 2024 · Updated last year
- McPAT modeling framework ☆13 · Oct 18, 2014 · Updated 11 years ago
- Fork of LLVM to support AMD AIEngine processors ☆202 · Updated this week
- A tool to generate optimized hardware files for univariate functions. ☆30 · Apr 5, 2024 · Updated 2 years ago
- RISC-V Rocket Chip Strap-on-Booster with Fused Universal Neural Network (FuNN) eNNgine ☆21 · Mar 17, 2022 · Updated 4 years ago
- [ICML 2021] "Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators" by Yonggan Fu, Yonga… ☆16 · Jan 3, 2022 · Updated 4 years ago
- Code to accompany "Weightless Neural Networks for Efficient Edge Inference", PACT 2022 ☆22 · Nov 15, 2022 · Updated 3 years ago
- ☆14 · Jun 22, 2026 · Updated last week
- Adaptive floating-point based numerical format for resilient deep learning ☆14 · Apr 11, 2022 · Updated 4 years ago
- ☆18 · Feb 13, 2021 · Updated 5 years ago
- DeiT implementation for Q-ViT ☆26 · Apr 21, 2025 · Updated last year
- Allo Accelerator Design and Programming Framework (PLDI'24) ☆388 · Jun 19, 2026 · Updated 2 weeks ago
- An MLIR Compiler for PyTorch/C/C++ Codes into HLS Dataflow Designs ☆68 · Aug 1, 2025 · Updated 11 months ago
- HLS Custom-Precision Floating-Point Library ☆13 · Nov 6, 2017 · Updated 8 years ago
- [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design ☆132 · Jun 27, 2023 · Updated 3 years ago
- This repository contains the hardware implementation for Static BFP convolution on FPGA ☆10 · Oct 15, 2019 · Updated 6 years ago
- BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (ICLR 2021) ☆41 · Jan 12, 2021 · Updated 5 years ago
- OV7670 (Verilog HDL) driver for FPGA ☆19 · Mar 4, 2019 · Updated 7 years ago
- An FPGA accelerator for general-purpose Sparse-Matrix Dense-Matrix Multiplication (SpMM). ☆95 · Jul 26, 2024 · Updated last year
- Generate an FPGA design for a TWN ☆11 · Nov 4, 2019 · Updated 6 years ago