VCA-EPFL / FSA
FSA: Fusing FlashAttention within a Single Systolic Array
☆32 Updated this week
Alternatives and similar repositories for FSA
Users who are interested in FSA are comparing it to the repositories listed below.
- Cluster-level matrix unit integration into GPUs, implemented in Chipyard SoC ☆36 Updated last month
- High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS ☆94 Updated 10 months ago
- FlexASR: A Reconfigurable Hardware Accelerator for Attention-based Seq-to-Seq Networks ☆46 Updated 5 months ago
- A DSL for Systolic Arrays ☆80 Updated 6 years ago
- Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads.