snu-comparch / Tender

Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24)

☆21

Alternatives and similar repositories for Tender

Users who are interested in Tender are comparing it to the repositories listed below.

jeffreyyu0602 / quantized-training
☆32Updated this week
abdelfattah-lab / BitMoD-HPCA-25
☆52Updated 3 months ago
clevercool / ANT-Quantization
☆111Updated last year
mit-han-lab / spatten
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
☆112Updated last year
leesou / H2-LLM-ISCA-2025
H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference
☆72Updated 6 months ago
upmem / upmem_llm_framework
UPMEM LLM Framework allows profiling PyTorch layers and functions and simulating those layers/functions with a given hardware profile.
☆36Updated 2 months ago
PrincetonUniversity / LLMCompass
☆196Updated this week
sjtu-zhao-lab / SALO
An efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences
☆29Updated last year
ranggihwang / Pregated_MoE
☆55Updated last year
scale-snu / attacc_simulator
☆97Updated last year
Accelergy-Project / micro22-sparseloop-artifact
MICRO22 artifact evaluation for Sparseloop
☆44Updated 3 years ago
casys-kaist / NeuPIMs
NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing
☆95Updated last year
hatsu3 / Sanger
☆48Updated 4 years ago
pku-liang / Sanger
A co-design architecture on sparse attention
☆53Updated 4 years ago
Zhaoshixin-sky / CIM-MLC
[ASPLOS 2024] CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators
☆47Updated last year
ChengZhang-98 / llm-mixed-q
Official implementation of EMNLP'23 paper "Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?"
☆23Updated 2 years ago
leesou / PIM-DL-ASPLOS
PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization
☆33Updated last year
scalesim-project / scale-sim-v3
☆48Updated 2 months ago
actlab-genesys / GeneSys
An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads.
☆68Updated 3 weeks ago
isakedo / DNNsim
☆35Updated 5 years ago
GATECH-EIC / ViTALiTy
ViTALiTy (HPCA'23) Code Repository
☆23Updated 2 years ago
arkhadem / aim_simulator
A simulator for SK hynix AiM PIM architecture based on Ramulator 2.0
☆41Updated 3 months ago
GATECH-EIC / ViTCoD
[HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
☆117Updated 2 years ago
Zhu-Zixuan / Bitlet-PE
A bit-level sparsity-aware multiply-accumulate processing element.
☆17Updated last year
SeoLabCornell / torch2chip
Torch2Chip (MLSys, 2024)
☆54Updated 6 months ago
SET-Scheduling-Project / GEMINI-HPCA2024
Open-source Framework for HPCA2024 paper: Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators
☆97Updated 5 months ago
ebby-s / MX-for-FPGA
Implementation of Microscaling data formats in SystemVerilog.
☆26Updated 3 months ago
ucb-bar / MoCA
☆28Updated 2 years ago
PSAL-POSTECH / ONNXim
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
☆157Updated 8 months ago
UCLA-VAST / Serpens
Serpens is an HBM FPGA accelerator for SpMV
☆22Updated last year