HLSTransform / submission

☆98

Alternatives and similar repositories for submission

Users who are interested in submission are comparing it to the libraries listed below.

DeepWok / mase
Machine-Learning Accelerator System Exploration Tools
☆173 Updated 2 months ago
aliemo / transfomers-silicon-research
Research and Materials on Hardware implementation of Transformer Model
☆273 Updated 5 months ago
hguq / HG-PIPE
FPGA-based hardware accelerator for Vision Transformer (ViT), with Hybrid-Grained Pipeline.
☆80 Updated 6 months ago
jha-lab / acceltran
[TCAD'23] AccelTran: A Sparsity-Aware Accelerator for Transformers
☆51 Updated last year
arc-research-lab / CHARM
CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture
☆148 Updated this week
KULeuven-MICAS / zigzag
HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators
☆154 Updated last week
makslevental / openhls
PyTorch model to RTL flow for low latency inference
☆130 Updated last year
SingularityKChen / dl_accelerator
Deep Learning Accelerator Based on Eyeriss V2 Architecture with custom RISC-V extended instructions
☆198 Updated 5 years ago
kachris / survey_HA_LLM
A survey on Hardware Accelerated LLMs
☆59 Updated 6 months ago
KULeuven-MICAS / stream
Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads.
☆57 Updated last month
pulp-platform / ITA
☆47 Updated 3 months ago
KastnerRG / cgra4ml
An Open Workflow to Build Custom SoCs and run Deep Models at the Edge
☆87 Updated 2 months ago
sharc-lab / Edge-MoE
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts
☆124 Updated last year
embedeep / FREE-TPU-V3plus-for-FPGA
FREE TPU V3plus for FPGA is the free version of a commercial AI processor (EEP-TPU) for Deep Learning EDGE Inference
☆152 Updated 2 years ago
FlightLLM / flightllm_test_demo
☆26 Updated last year
arc-research-lab / SSR
SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration (Full Paper Accepted in FPGA'24)
☆32 Updated this week
gnodipac886 / ViT-FPGA-TPU
FPGA based Vision Transformer accelerator (Harvard CS205)
☆123 Updated 5 months ago
cjg91 / trans-fat
An FPGA Accelerator for Transformer Inference
☆88 Updated 3 years ago
actlab-genesys / GeneSys
An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads.
☆60 Updated 4 months ago
debtanu09 / systolic_array_matrix_multiplier
This is a verilog implementation of 4x4 systolic array multiplier
☆58 Updated 4 years ago
linghaosong / Sextans
An FPGA accelerator for general-purpose Sparse-Matrix Dense-Matrix Multiplication (SpMM).
☆81 Updated last year
cornell-zhang / HiSparse
High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS
☆94 Updated 10 months ago
mit-han-lab / spatten
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
☆100 Updated 11 months ago
georgia-tech-synergy-lab / SIGMA
RTL implementation of Flex-DPE.
☆108 Updated 5 years ago
PSAL-POSTECH / ONNXim
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
☆135 Updated 5 months ago
leo47007 / TPU-Tensor-Processing-Unit
IC implementation of TPU
☆128 Updated 5 years ago
xliu0709 / WinoCNN
An HLS based winograd systolic CNN accelerator
☆53 Updated 4 years ago
MartaAndronic / NeuraLUT
NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions
☆37 Updated 4 months ago
tancheng / CGRA-Flow
CGRA-Flow is an integrated framework for CGRA compilation, exploration, synthesis, and development.
☆135 Updated last month
UCLA-VAST / AutoSA
AutoSA: Polyhedral-Based Systolic Array Compiler
☆221 Updated 2 years ago