SamsungLabs / PMPD

Codebase for the Progressive Mixed-Precision Decoding paper.

☆18

Alternatives and similar repositories for PMPD

Users who are interested in PMPD compare it to the repositories listed below.

clevercool / ANT-Quantization
☆113 Updated 2 years ago
ChengZhang-98 / llm-mixed-q
Official implementation of EMNLP'23 paper "Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?"
☆24 Updated 2 years ago
snu-comparch / Tender
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24)
☆23 Updated last year
hsharma35 / bitfusion
Simulator for BitFusion
☆102 Updated 5 years ago
yanghr / BSQ
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (ICLR 2021)
☆42 Updated 4 years ago
jeffreyyu0602 / quantized-training
☆32 Updated last week
sfox14 / block_minifloat
Training with Block Minifloat number representation
☆17 Updated 4 years ago
georgia-tech-synergy-lab / CLAMP-ViT
[ECCV 2024] CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
☆15 Updated last year
isakedo / DNNsim
☆35 Updated 5 years ago
naver-aics / lut-gemm
☆80 Updated last year
CLab-HKUST-GZ / micro58-axcore
☆26 Updated last month
VITA-Group / Q-Hitter
☆15 Updated last year
ebby-s / MX-for-FPGA
Implementation of Microscaling data formats in SystemVerilog.
☆28 Updated 5 months ago
wimh966 / outlier_suppression
The official PyTorch implementation of the NeurIPS2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer L…
☆49 Updated 3 years ago
PannenetsF / TQT
PyTorch implementation of TQT.
☆21 Updated 3 years ago
parsa-epfl / quantization-sparsity-interplay
This repo contains the code for studying the interplay between quantization and sparsity methods
☆24 Updated 9 months ago
mit-han-lab / spatten
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
☆114 Updated last year
ttambe / AdaptivFloat
Adaptive floating-point based numerical format for resilient deep learning
☆14 Updated 3 years ago
PrincetonUniversity / LLMCompass
☆209 Updated last month
GATECH-EIC / DNN-Chip-Predictor
[ICASSP'20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architecture…
☆25 Updated 3 years ago
aojunzz / DominoSearch
☆19 Updated 3 years ago
jha-lab / acceltran
[TCAD'23] AccelTran: A Sparsity-Aware Accelerator for Transformers
☆54 Updated 2 years ago
SeoLabCornell / torch2chip
Torch2Chip (MLSys, 2024)
☆54 Updated 8 months ago
GATECH-EIC / ViTCoD
[HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
☆123 Updated 2 years ago
sharc-lab / Edge-MoE
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts
☆129 Updated last year
Qualcomm-AI-research / oscillations-qat
☆78 Updated 3 years ago
hqjenny / CoDeNet
☆19 Updated 4 years ago
ma3mool / goldeneye
GoldenEye is a functional simulator with fault injection capabilities for common and emerging numerical formats, implemented for the PyTo…
☆26 Updated last year
zhexinli / Q-ViT-DeiT
DeiT implementation for Q-ViT
☆25 Updated 7 months ago
wangmaolin / niti
Implementation of "NITI: Training Integer Neural Networks Using Integer-only Arithmetic" on arXiv
☆86 Updated 3 years ago