ChengZhang-98/LQER

Official implementation of ICML'24 paper "LQER: Low-Rank Quantization Error Reconstruction for LLMs"

☆19

Alternatives and similar repositories for LQER

Users who are interested in LQER are comparing it to the repositories listed below.

BrotherHappy / OSTQuant
[ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt…
☆94 · Apr 8, 2025 · Updated last year
Xingyu-Zheng / FOEM
(AAAI 2026) First-Order Error Matters: Accurate Compensation for Quantized Large Language Models
☆16 · Apr 16, 2026 · Updated 3 months ago
Qualcomm-AI-research / gptvq
☆42 · Mar 28, 2024 · Updated 2 years ago
utkarsh-dmx / project-resq
☆35 · Mar 28, 2025 · Updated last year
Intelligent-Computing-Lab-Panda / GPTAQ
Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692)
☆92 · Jul 28, 2025 · Updated 11 months ago
NVlabs / EoRA
[ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation
☆49 · Apr 21, 2026 · Updated 3 months ago
A-suozhang / MixDQ
[ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
☆14 · Nov 27, 2024 · Updated last year
yxli2123 / LoSparse
☆64 · Oct 17, 2023 · Updated 2 years ago
Paramathic / slim
SLiM: One-shot Quantized Sparse Plus Low-rank Approximation of LLMs (ICML 2025)
☆37 · Nov 28, 2025 · Updated 7 months ago
vsingh-group / FrameQuant
☆11 · Nov 16, 2024 · Updated last year
spcl / spatial-collectives
Optimized communication collectives for the Cerebras waferscale engine
☆17 · Jun 5, 2024 · Updated 2 years ago
wzhuang-xmu / LoSA
[ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models".
☆25 · Mar 16, 2025 · Updated last year
deep-optimization / SliderQuant
The official project website of "SliderQuant: Accurate Post-Training Quantization for LLMs" (accepted to ICLR 2026).
☆24 · Jun 15, 2026 · Updated last month
PASSIONLab / MaskedSpGEMM
☆10 · Jul 4, 2022 · Updated 4 years ago
NYCU-EDgeAi / subspec
[NeurIPS 2025] Speculate Deep and Accurate
☆23 · Jan 16, 2026 · Updated 6 months ago
Qualcomm-AI-research / lr-qat
☆54 · Nov 5, 2024 · Updated last year
fmfi-compbio / admm-pruning
☆30 · Jul 22, 2024 · Updated 2 years ago
SFFAI-AIKT / AIKT
This is an open resource project for Artificial Intelligence
☆13 · Mar 3, 2019 · Updated 7 years ago
JialinMao / private_CNN
☆10 · Jun 1, 2022 · Updated 4 years ago
jeffreyyu0602 / quantized-training
☆35 · Dec 22, 2025 · Updated 7 months ago
hahnyuan / ASVD4LLM
Activation-aware Singular Value Decomposition for Compressing Large Language Models
☆92 · Oct 22, 2024 · Updated last year
zhangsichengsjtu / AFPQ
AFPQ code implementation
☆23 · Nov 6, 2023 · Updated 2 years ago
real-absolute-AI / RAPID
[ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding
☆23 · Mar 2, 2025 · Updated last year
JingyangXiang / DFRot
[COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation; Zhihu: https://zhuanlan.zhihu.c…
☆30 · Mar 5, 2025 · Updated last year
Qualcomm-AI-research / llm-surgeon
☆35 · May 24, 2024 · Updated 2 years ago
guoyang9 / PELA
PELA: Learning Parameter-Efficient Models with Low-Rank Approximation [CVPR 2024]
☆19 · Apr 14, 2024 · Updated 2 years ago
xvyaward / owq
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…
☆72 · Mar 7, 2024 · Updated 2 years ago
cmd2001 / KVTuner
[ICML2025] KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference
☆29 · Jan 27, 2026 · Updated 5 months ago
facebookresearch / SpinQuant
Code repo for the paper "SpinQuant: LLM quantization with learned rotations"
☆417 · Feb 14, 2025 · Updated last year
ruikangliu / FlatQuant
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
☆223 · Nov 25, 2025 · Updated 8 months ago
42Shawn / PTQ4DM
Implementation of Post-training Quantization on Diffusion Models (CVPR 2023)
☆146 · Apr 1, 2023 · Updated 3 years ago
luuyin / OWL
Official PyTorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
☆82 · Jul 7, 2025 · Updated last year
LOG-postech / rethinking-LLM-pruning
[EMNLP 2024] Official implementation of "Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimiza…
☆28 · Feb 21, 2025 · Updated last year
yuxwind / CBS
Official Code of The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks [ICML 2022]
☆16 · Sep 20, 2022 · Updated 3 years ago
Aaronhuang-778 / SliM-LLM
[ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
☆62 · Aug 9, 2024 · Updated last year
Chaos96 / fourierft
☆154 · Sep 9, 2024 · Updated last year
Zhu-ZiXuan / Bitlet-PE
A bit-level sparsity-aware multiply-accumulate processing element.
☆19 · Jul 9, 2024 · Updated 2 years ago
merledu / magma-si
Matrix Accelerator Generator for GeMM Operations based on SIGMA Architecture in CHISEL HDL
☆15 · Mar 21, 2024 · Updated 2 years ago
IST-DASLab / MicroAdam
This repository contains code for the MicroAdam paper.
☆21 · Dec 14, 2024 · Updated last year