HuangOwen / RoLoRA

[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization

☆37

Alternatives and similar repositories for RoLoRA

Users who are interested in RoLoRA are comparing it to the repositories listed below.

PingchengDong / GQA-LUT
The official implementation of the DAC 2024 paper GQA-LUT
☆20 · Dec 20, 2024 · Updated last year
Intelligent-Computing-Lab-Panda / TesseraQ
☆25 · Oct 31, 2024 · Updated last year
htqin / IR-QLoRA
[ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…
☆67 · Apr 15, 2024 · Updated last year
z-lab / sparselora
[ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
☆71 · Jul 5, 2025 · Updated 7 months ago
zhangsichengsjtu / AFPQ
AFPQ code implementation
☆23 · Nov 6, 2023 · Updated 2 years ago
ChenMnZ / PrefixQuant
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆172 · Nov 26, 2025 · Updated 2 months ago
cornell-zhang / llm-datatypes
Codebase for ICML'24 paper: Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
☆27 · Jun 25, 2024 · Updated last year
Intelligent-Computing-Lab-Panda / GPTAQ
Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692)
☆81 · Jul 28, 2025 · Updated 6 months ago
HuangOwen / Quantization-Variation
[TMLR] Official PyTorch implementation of paper "Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precisio…
☆48 · Sep 27, 2024 · Updated last year
facebookresearch / SpinQuant
Code repo for the paper "SpinQuant: LLM quantization with learned rotations"
☆372 · Feb 14, 2025 · Updated last year
Xingyu-Zheng / BiDM
(NeurIPS 2024) BiDM: Pushing the Limit of Quantization for Diffusion Models
☆22 · Nov 20, 2024 · Updated last year
nbasyl / LLM-FP4
The official implementation of the EMNLP 2023 paper LLM-FP4
☆220 · Dec 15, 2023 · Updated 2 years ago
yxli2123 / LoftQ
☆235 · Jun 11, 2024 · Updated last year
JingyangXiang / DFRot
[COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation; Zhihu: https://zhuanlan.zhihu.c…
☆29 · Mar 5, 2025 · Updated 11 months ago
IST-DASLab / QuEST
Work in progress.
☆79 · Nov 25, 2025 · Updated 2 months ago
Xingyu-Zheng / BinaryDM
(ICLR 2025) BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
☆26 · Oct 4, 2024 · Updated last year
ruikangliu / FlatQuant
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
☆211 · Nov 25, 2025 · Updated 2 months ago
Aaronhuang-778 / SliM-LLM
[ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
☆51 · Aug 9, 2024 · Updated last year
INV-WZQ / SparseD
[ICLR 2026] SparseD: Sparse Attention for Diffusion Language Models
☆57 · Oct 7, 2025 · Updated 4 months ago
iankur / vqllm
Residual vector quantization for KV cache compression in large language models
☆11 · Oct 22, 2024 · Updated last year
zjq0455 / PTQ1.61
☆15 · Jan 12, 2026 · Updated last month
AozhongZhang / MagR
☆13 · Jun 22, 2025 · Updated 7 months ago
wangitu / CherryQ
☆14 · May 21, 2024 · Updated last year
YouAreSpecialToMe / QST
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
☆49 · Nov 5, 2024 · Updated last year
XIANGLONGYAN / PBS2P
PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs"
☆13 · Sep 28, 2025 · Updated 4 months ago
ruikangliu / Quantized-Reasoning-Models
[COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models"
☆67 · Jul 8, 2025 · Updated 7 months ago
mengxiayu / LLMSuperWeight
Code for studying the super weight in LLMs
☆121 · Dec 3, 2024 · Updated last year
gccnlp / Light-PEFT
[ACL 2024 Findings] Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning
☆13 · Sep 2, 2024 · Updated last year
Ther-nullptr / circult-eda-mlsys-tinyml-arxiv-daily
🎓 Automatically updates circult-eda-mlsys-tinyml papers daily using GitHub Actions (updates every 8 hours)
☆10 · Updated this week
ysngki / UMoE
☆20 · Oct 22, 2025 · Updated 3 months ago
snu-mllab / GuidedQuant
Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)
☆50 · Jul 6, 2025 · Updated 7 months ago
HuangOwen / QAT-ACS
[TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection"
☆37 · Aug 20, 2024 · Updated last year
DravenALG / ReSTE
(ICCV 2023) Official implementation of Rectified Straight Through Estimator (ReSTE).
☆31 · Sep 20, 2024 · Updated last year
bytedance / AffineQuant
Official implementation of the ICLR 2024 paper AffineQuant
☆28 · Mar 30, 2024 · Updated last year
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35 · Jun 12, 2024 · Updated last year
IST-DASLab / HALO
HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arx…
☆29 · Feb 17, 2025 · Updated 11 months ago
krafton-ai / lexico
KV cache compression via sparse coding
☆17 · Oct 26, 2025 · Updated 3 months ago
MetaLearners / Solution-to-CVPR2021-NAS-competition-Track-1
☆13 · Jun 28, 2021 · Updated 4 years ago
ArminAzizi98 / LaMDA
☆15 · Nov 7, 2024 · Updated last year
