kachris / survey_HA_LLM
A survey on Hardware Accelerated LLMs
☆34 Updated 10 months ago
Related projects
Alternatives and complementary repositories for survey_HA_LLM
- ☆45 Updated 2 months ago
- ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference ☆67 Updated last week
- Hands-on experience programming AI Engines using Vitis Unified Software Platform ☆37 Updated 3 months ago
- A fast, accurate trace-based simulator for High-Level Synthesis. ☆36 Updated 6 months ago
- High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS ☆80 Updated last month
- AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP (Full Paper a… ☆21 Updated last month
- FlexASR: A Reconfigurable Hardware Accelerator for Attention-based Seq-to-Seq Networks ☆42 Updated 2 years ago
- CHARM: Composing Heterogeneous Accelerators on Versal ACAP Architecture ☆124 Updated 2 weeks ago
- A DSL for Systolic Arrays ☆78 Updated 5 years ago
- Alveo Collective Communication Library: MPI-like communication operations for Xilinx Alveo accelerators ☆81 Updated last month
- A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching ☆30 Updated last month
- ACM TODAES Best Paper Award, 2022 ☆24 Updated last year
- Processing in Memory Emulation ☆18 Updated last year
- Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads. ☆40 Updated 2 weeks ago
- The Riallto Open Source Project from AMD ☆68 Updated last week
- ☆26 Updated 3 years ago
- ☆30 Updated 2 months ago
- CGRA-Flow is an integrated framework for CGRA compilation, exploration, synthesis, and development. ☆113 Updated last week
- An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads. ☆31 Updated 7 months ago
- SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted in FPGA'24) ☆26 Updated 4 months ago
- An FPGA accelerator for general-purpose Sparse-Matrix Dense-Matrix Multiplication (SpMM). ☆71 Updated 3 months ago
- Allo: A Programming Model for Composable Accelerator Design ☆146 Updated this week
- A toolchain for rapid design space exploration of chiplet architectures ☆33 Updated 6 months ago
- An MLIR dialect to enable the efficient acceleration of ML model on CGRAs. ☆53 Updated last month
- ☆83 Updated 5 months ago
- Fork of seldridge/rocket-rocc-examples with tests for a systolic array based matmul accelerator ☆52 Updated 2 weeks ago
- An Open Workflow to Build Custom SoCs and run Deep Models at the Edge ☆65 Updated 3 months ago
- CGRA framework with vectorization support. ☆19 Updated this week
- ☆21 Updated last month
- Machine-Learning Accelerator System Exploration Tools ☆124 Updated this week