GATECH-EIC/ShiftAddLLM

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

☆114

Alternatives and similar repositories for ShiftAddLLM

Users that are interested in ShiftAddLLM are comparing it to the libraries listed below.

GATECH-EIC / torchshiftadd
An open-sourced PyTorch library for developing energy efficient multiplication-less models and applications.
☆14 · Feb 3, 2025 · Updated last year
GATECH-EIC / AmoebaLLM
[NeurIPS 2024] "AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment" by Yonggan Fu, Zhongzhi Yu,…
☆19 · Dec 13, 2024 · Updated last year
GATECH-EIC / ShiftAddViT
[NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
☆30 · Dec 6, 2023 · Updated 2 years ago
GATECH-EIC / SuperTickets
[ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning
☆20 · Jul 7, 2022 · Updated 4 years ago
GATECH-EIC / ShiftAddNAS
[ICML 2022] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
☆15 · May 18, 2022 · Updated 4 years ago
GATECH-EIC / ViTALiTy
ViTALiTy (HPCA'23) Code Repository
☆23 · Mar 13, 2023 · Updated 3 years ago
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated 2 years ago
GATECH-EIC / ViTCoD
[HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
☆133 · Jun 27, 2023 · Updated 3 years ago
Aaronhuang-778 / SliM-LLM
[ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
☆62 · Aug 9, 2024 · Updated last year
HanGuo97 / flute
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
☆391 · Apr 13, 2025 · Updated last year
VITA-Group / Q-Hitter
☆15 · Jun 4, 2024 · Updated 2 years ago
spcl / QuaRot
Code for Neurips24 paper: QuaRot, an end-to-end 4-bit inference of large language models.
☆523 · Nov 26, 2024 · Updated last year
ChenMnZ / PrefixQuant
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆176 · Nov 26, 2025 · Updated 7 months ago
GATECH-EIC / ShiftAddNet
[NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network
☆74 · Nov 16, 2020 · Updated 5 years ago
georgia-tech-synergy-lab / CLAMP-ViT
[ECCV 2024] CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
☆19 · Jul 2, 2024 · Updated 2 years ago
naver-aics / lut-gemm
☆82 · Apr 1, 2024 · Updated 2 years ago
ArminAzizi98 / LaMDA
☆15 · Nov 7, 2024 · Updated last year
ksouvik52 / hiresnn2021
☆14 · May 13, 2022 · Updated 4 years ago
GATECH-EIC / FracTrain
[NeurIPS 2020] "FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training" by Yonggan Fu, Ha…
☆10 · Feb 13, 2022 · Updated 4 years ago
maestro-project / AIrchitect-v2
[DATE 2025] Official implementation and dataset of AIrchitect v2: Learning the Hardware Accelerator Design Space through Unified Represen…
☆20 · Jan 17, 2025 · Updated last year
epfml / pam
☆16 · Dec 9, 2023 · Updated 2 years ago
jy-yuan / KIVI
[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
☆418 · Nov 20, 2025 · Updated 8 months ago
enyac-group / Elana
Elana: A Simple Energy & Latency Analyzer for LLMs
☆16 · Apr 3, 2026 · Updated 3 months ago
enyac-group / UniQL
UniQL official repository (ICLR 2026)
☆16 · Jan 27, 2026 · Updated 5 months ago
FLOW-open-project / FLOW
Codebase for layer wise N:M pruning pattern assignment for LLMs
☆15 · Aug 5, 2025 · Updated 11 months ago
VITA-Group / R-Sparse
[ICLR'25] R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
☆21 · Apr 28, 2025 · Updated last year
GATECH-EIC / Castling-ViT
[CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference
☆31 · Mar 14, 2024 · Updated 2 years ago
aiming-lab / CITER
[COLM'25] CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
☆19 · Jun 25, 2025 · Updated last year
GATECH-EIC / Edge-LLM
[DAC 2024] EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive La…
☆91 · Jun 30, 2024 · Updated 2 years ago
Infini-AI-Lab / TriForce
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
☆281 · Aug 31, 2024 · Updated last year
SNU-ARC / any-precision-llm
[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
☆130 · Jul 4, 2025 · Updated last year
jadohu / LANTERN
Official Implementation of LANTERN (ICLR'25) and LANTERN++(ICLRW-SCOPE'25)
☆21 · Mar 5, 2025 · Updated last year
GATECH-EIC / Double-Win-Quant
[ICML 2021] "Double-Win Quant: Aggressively Winning Robustness of Quantized Deep Neural Networks via Random Precision Training and Inferen…
☆16 · Feb 13, 2022 · Updated 4 years ago
microsoft / T-MAC
Low-bit LLM inference on CPU/NPU with lookup table
☆972 · Jun 5, 2025 · Updated last year
GATECH-EIC / GCoD
[HPCA 2022] GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design
☆38 · Mar 30, 2022 · Updated 4 years ago
ChengZhang-98 / llm-mixed-q
Official implementation of EMNLP'23 paper "Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?"
☆24 · Oct 25, 2023 · Updated 2 years ago
SqueezeAILab / SqueezeLLM
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
☆722 · Aug 13, 2024 · Updated last year
kuleshov-group / MODULoRA-Experiment
Evaluation Code repository for the paper "ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers". (2023…
☆13 · Dec 5, 2023 · Updated 2 years ago
nbasyl / LLM-FP4
The official implementation of the EMNLP 2023 paper LLM-FP4
☆224 · Dec 15, 2023 · Updated 2 years ago