☆119 · last updated Jan 11, 2024
Alternatives and similar repositories for submission
Users that are interested in submission are comparing it to the libraries listed below.
- Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts (☆134, last updated May 10, 2024)
- An FPGA Accelerator for Transformer Inference (☆93, last updated Apr 29, 2022)
- [DATE 2025] Official implementation and dataset of AIrchitect v2: Learning the Hardware Accelerator Design Space through Unified Represen… (☆19, last updated Jan 17, 2025)
- SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (full paper accepted at FPGA'24) (☆36, updated this week)
- FPGA-based hardware accelerator for Vision Transformer (ViT) with a hybrid-grained pipeline (☆129, last updated Jan 20, 2025)
- [TRETS 2025][FPGA 2024] FPGA Accelerator for Imbalanced SpMV using HLS (☆20, last updated Aug 24, 2025)
- (no description; ☆14, last updated Jun 22, 2022)
- Research and materials on hardware implementation of the Transformer model (☆298, last updated Feb 28, 2025)
- Artifact evaluation of the PLDI'24 paper "Allo: A Programming Model for Composable Accelerator Design" (☆33, last updated Apr 11, 2024)
- Runs on the PYNQ-Z1; the repository contains the relevant Verilog code, Vivado configuration, and C code for SDK testing. The size o… (☆231, last updated Mar 24, 2024)
- Papers for a comprehensive survey on accelerated generation techniques in Large Language Models (LLMs) (☆11, last updated May 24, 2024)
- Attentionlego (☆13, last updated Jan 24, 2024)
- Allo Accelerator Design and Programming Framework (PLDI'24) (☆353, updated this week)
- An efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences (☆31, last updated Mar 7, 2024)
- (no description; ☆15, last updated Aug 10, 2023)
- (no description; ☆14, last updated Jun 4, 2024)
- The official implementation of the DAC 2024 paper GQA-LUT (☆20, last updated Dec 20, 2024)
- (no description; ☆16, last updated Apr 10, 2023)
- (no description; ☆46, last updated Apr 8, 2023)
- MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine (accepted as a full paper at FPT'23) (☆21, last updated Apr 17, 2024)
- Scalable systolic-array-based matrix-matrix multiplication implemented in Vivado HLS for Xilinx FPGAs (☆376, last updated Jan 20, 2025)
- A hobby project in SystemVerilog that accelerates the LeViT network, which contains CNN and attention layers (☆33, last updated Aug 13, 2024)
- (no description; ☆62, last updated Mar 24, 2025)
- ViTALiTy (HPCA'23) code repository (☆23, last updated Mar 13, 2023)
- (no description; ☆26, last updated Dec 12, 2022)
- (no description; ☆17, last updated Feb 13, 2021)
- A survey on hardware-accelerated LLMs (☆62, last updated Jan 13, 2025)
- Tracks cross-references and allows fast viewing of pseudocode between references (☆13, last updated Mar 10, 2025)
- Collection of kernel accelerators optimised for LLM execution (☆27, last updated Feb 26, 2026)
- A graph linear algebra overlay (☆52, last updated Apr 26, 2023)
- A general-purpose CNN accelerator for Xilinx FPGAs; the design targets the KV260 board and is portable to any MPSoC platform (☆18, last updated Dec 13, 2024)
- [ISCA'25] LIA: A Single-GPU LLM Inference Acceleration with Cooperative AMX-Enabled CPU-GPU Computation and CXL Offloading (☆13, last updated Jun 28, 2025)
- A C++ implementation of ViT (☆12, last updated Nov 13, 2022)
- JEDI-net: a jet identification algorithm based on interaction networks (☆10, last updated Aug 16, 2020)
- Accelerates a multi-head attention Transformer model using HLS for FPGA (☆11, last updated Dec 7, 2023)
- A series of quick-start guides for the Vitis HLS tool, in Chinese. It explains the basic concepts and the most important optimize techni… (☆26, last updated Nov 9, 2022)
- (no description; ☆28, last updated Feb 26, 2023)
- HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond (☆49, last updated Feb 24, 2026)
- System-on-a-Chip for FPGA, with xr16 RISC core and LCC port (☆12, last updated Jul 23, 2017)
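Several entries above (the Vivado HLS systolic-array matrix multiplier, the MaxEVA Versal work) center on systolic matrix multiplication. As a rough illustration of the output-stationary dataflow such accelerators commonly use (a hypothetical sketch, not the implementation of any listed repository), the following Python snippet simulates an array where PE (i, j) keeps accumulator C[i][j] while operands arrive skewed in time:

```python
def systolic_matmul(A, B):
    """Cycle-by-cycle simulation of an output-stationary systolic array.

    PE (i, j) holds accumulator C[i][j]. At cycle t it consumes A[i][k]
    (streamed from the left) and B[k][j] (streamed from the top), where
    k = t - i - j; the offset models the diagonal skew with which
    operands are injected into a real hardware array.
    """
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    # The last PE (M-1, N-1) sees its final operand pair (k = K-1)
    # at cycle t = (K-1) + (M-1) + (N-1), hence the cycle count below.
    for t in range(M + N + K - 2):
        for i in range(M):
            for j in range(N):
                k = t - i - j
                if 0 <= k < K:
                    C[i][j] += A[i][k] * B[k][j]  # one MAC per PE per cycle
    return C
```

In hardware, the two inner loops run in parallel across the PE grid, so the latency is the M + N + K - 2 cycle count of the outer loop rather than the O(MNK) work of this software model.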