SSR: Spatial Sequential Hybrid Architecture for Latency-Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted at FPGA'24)
☆35 · Updated this week (as of Mar 1, 2026)
Alternatives and similar repositories for SSR
Users interested in SSR are comparing it to the repositories listed below.
- AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP (Full Paper a… (☆25 · Updated May 18, 2025)
- CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture (☆164 · Updated this week)
- MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine (accepted as full paper at FPT'23) (☆21 · Updated Apr 17, 2024)
- C++ code for an HLS FPGA implementation of a transformer (☆20 · Updated Sep 11, 2024)
- ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines (FPGA 2025 Best Paper Nominee) (☆59 · Updated Feb 24, 2026)
- [FPGA 2024] Source code and bitstream for LevelST: Stream-based Accelerator for Sparse Triangular Solver (☆15 · Updated Jun 1, 2025)
- SCARIF: a tool to estimate the embodied carbon emissions of data-center servers with accelerator hardware (GPUs, FPGAs, etc.) (☆15 · Updated this week)
- Attentionlego (☆13 · Updated Jan 24, 2024)
- [DATE 2025] Official implementation and dataset of AIrchitect v2: Learning the Hardware Accelerator Design Space through Unified Represen… (☆19 · Updated Jan 17, 2025)
- ☆62 · Updated Mar 24, 2025
- ☆18 · Updated Aug 9, 2025
- Runs on the PYNQ-Z1; the repository contains the relevant Verilog code, Vivado configuration, and C code for SDK testing. The size o… (☆230 · Updated Mar 24, 2024)
- FPGA-based Vision Transformer accelerator (Harvard CS205) (☆150 · Updated Feb 11, 2025)
- Collection of kernel accelerators optimized for LLM execution (☆27 · Updated this week)
- An FPGA accelerator for transformer inference (☆93 · Updated Apr 29, 2022)
- XRM (Xilinx FPGA Resource Manager) documentation (☆25 · Updated Nov 13, 2023)
- TAPA compiles task-parallel HLS programs into high-performance FPGA accelerators. UCLA-maintained. (☆182 · Updated Aug 16, 2025)
- Accelerating a multi-head attention transformer model using HLS for FPGA (☆11 · Updated Dec 7, 2023)
- CNN SIMD-based accelerator using Vitis HLS (☆11 · Updated Jul 15, 2022)
- An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE (☆17 · Updated Aug 5, 2022)
- ☆13 · Updated Mar 22, 2024
- An efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences (☆31 · Updated Mar 7, 2024)
- A series of quick-start guides for the Vitis HLS tool, in Chinese. It explains the basic concepts and the most important optimize techni… (☆26 · Updated Nov 9, 2022)
- ☆119 · Updated Jan 11, 2024
- Open-source AI acceleration on FPGA: from ONNX to RTL (☆49 · Updated Jan 5, 2026)
- ☆126 · Updated this week
- ☆18 · Updated May 1, 2024
- An open-source PyTorch library for developing energy-efficient, multiplication-less models and applications (☆14 · Updated Feb 3, 2025)
- ☆32 · Updated Mar 31, 2025
- Allo Accelerator Design and Programming Framework (PLDI'24) (☆352 · Updated Feb 8, 2026)
- Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts (☆133 · Updated May 10, 2024)
- Xilinx modifications to Halide (☆13 · Updated May 3, 2021)
- MEEP FPGA Shell project, currently supporting the Alveo U280 and U55C (☆14 · Updated Mar 14, 2024)
- FPGA-based hardware accelerator for Vision Transformer (ViT) with a hybrid-grained pipeline (☆126 · Updated Jan 20, 2025)
- ☆16 · Updated Aug 29, 2024
- ☆15 · Updated Aug 10, 2023
- FPGA implementation of an 8x8 weight-stationary systolic-array DNN accelerator (☆17 · Updated Feb 27, 2021)
- An alternative Vivado custom-design example (as opposed to a fully Vitis flow) for the User Logic Partition targeting the VCK5000 (☆13 · Updated Jul 16, 2024)
- Generates Versal system designs from ONNX models, with AI Engine kernels; sub-microsecond latency for autoencoders (☆16 · Updated Dec 29, 2024)