kachris / survey_HA_LLM

A survey on Hardware Accelerated LLMs

☆59

Alternatives and similar repositories for survey_HA_LLM

Users who are interested in survey_HA_LLM are comparing it to the repositories listed below.

actlab-genesys / GeneSys
An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads.
☆60 Updated 4 months ago
DeepWok / mase
Machine-Learning Accelerator System Exploration Tools
☆173 Updated 2 months ago
PSAL-POSTECH / ONNXim
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
☆135 Updated 5 months ago
KULeuven-MICAS / zigzag
HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators
☆155 Updated 2 weeks ago
linghaosong / Sextans
An FPGA accelerator for general-purpose Sparse-Matrix Dense-Matrix Multiplication (SpMM).
☆81 Updated last year
KULeuven-MICAS / stream
Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads.
☆57 Updated last month
UIUC-ChenLab / ScaleHLS-HIDA
☆56 Updated 4 months ago
arc-research-lab / CHARM
CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture
☆148 Updated this week
cornell-zhang / HiSparse
High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS
☆94 Updated 10 months ago
mit-emze / cimloop
☆64 Updated last month
maeri-project / FEATHER
A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching
☆56 Updated 4 months ago
Accelergy-Project / accelergy-timeloop-infrastructure
Linux docker for the DNN accelerator exploration infrastructure composed of Accelergy and Timeloop
☆56 Updated 3 months ago
PrincetonUniversity / muchiSim
Simulator framework for analysis of performance, energy consumption, area and cost of multi-node multi-chiplet tile-based manycore design…
☆68 Updated last year
leesou / H2-LLM-ISCA-2025
H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference
☆45 Updated 3 months ago
jha-lab / acceltran
[TCAD'23] AccelTran: A Sparsity-Aware Accelerator for Transformers
☆51 Updated last year
georgia-tech-synergy-lab / SIGMA
RTL implementation of Flex-DPE.
☆108 Updated 5 years ago
arc-research-lab / SSR
SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted in FPGA'24)
☆32 Updated this week
casys-kaist / mNPUsim
mNPUsim: A Cycle-accurate Multi-core NPU Simulator (IISWC 2023)
☆60 Updated 7 months ago
arkhadem / aim_simulator
A simulator for SK hynix AiM PIM architecture based on Ramulator 2.0
☆30 Updated 2 weeks ago
diwu1990 / uSystolic-Sim
A systolic array simulator for multi-cycle MACs and varying-byte words, with the paper accepted to HPCA 2022.
☆80 Updated 3 years ago
suchandler96 / gem5-NVDLA
☆33 Updated 4 months ago
cornell-zhang / allo
Allo: A Programming Model for Composable Accelerator Design
☆255 Updated this week
Accelergy-Project / micro22-sparseloop-artifact
MICRO22 artifact evaluation for Sparseloop
☆44 Updated 3 years ago
harvard-acc / FlexASR
FlexASR: A Reconfigurable Hardware Accelerator for Attention-based Seq-to-Seq Networks
☆46 Updated 5 months ago
cjg91 / trans-fat
An FPGA Accelerator for Transformer Inference
☆88 Updated 3 years ago
cwfletcher / buffets
Implementations of Buffets, which are efficient, composable idioms for implementing Explicit Decoupled Data Orchestration.
☆77 Updated 6 years ago
MartaAndronic / NeuraLUT
NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions
☆37 Updated 4 months ago
hngenc / systolic-array
A DSL for Systolic Arrays
☆80 Updated 6 years ago
Accelergy-Project / timeloop-accelergy-exercises
Exercises for exploring the Fibertree, Timeloop and Accelergy tools
☆101 Updated 4 months ago
kelvin0207 / SparSynergy
Open source RTL implementation of Tensor Core, Sparse Tensor Core, BitWave and SparSynergy in the article: "SparSynergy: Unlocking Flexib…
☆17 Updated 4 months ago