Intelligent-Computing-Lab-Panda/TesseraQ

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Intelligent-Computing-Lab-Panda/TesseraQ)

Intelligent-Computing-Lab-Panda / TesseraQ

☆25

Alternatives and similar repositories for TesseraQ

Users that are interested in TesseraQ are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ChenMnZ / PrefixQuant
View on GitHub
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆176Nov 26, 2025Updated 7 months ago
HuangOwen / RoLoRA
View on GitHub
[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
☆41Sep 24, 2024Updated last year
thu-nics / PM-KVQ
View on GitHub
The official code implementation for paper "PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs"
☆29May 24, 2025Updated last year
1157942086 / CVPR2020_Auxiliary_Quantization
View on GitHub
Training Quantized Neural Networks with a Full-precision Auxiliary Module
☆13Jun 19, 2020Updated 6 years ago
utkarsh-dmx / project-resq
View on GitHub
☆35Mar 28, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
linfeng93 / BiTA
View on GitHub
An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification.
☆29Apr 15, 2025Updated last year
iamkanghyunchoi / ait
View on GitHub
It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher [CVPR 2022 Oral]
☆29Sep 15, 2022Updated 3 years ago
FFTYYY / RaanA
View on GitHub
Implementation of "RaanA: A Fast, Flexible, and Data-Efficient Post-Training Quantization Algorithm"
☆17Apr 11, 2025Updated last year
Intelligent-Computing-Lab-Panda / GPTAQ
View on GitHub
Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692)
☆92Jul 28, 2025Updated 11 months ago
xvyaward / owq
View on GitHub
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…
☆72Mar 7, 2024Updated 2 years ago
ModelTC / LPCV2021_Winner_Solution
View on GitHub
☆28Nov 5, 2021Updated 4 years ago
WeixiangXu / STTN
View on GitHub
☆17Oct 25, 2022Updated 3 years ago
ruikangliu / FlatQuant
View on GitHub
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
☆223Nov 25, 2025Updated 8 months ago
cmd2001 / KVTuner
View on GitHub
[ICML2025] KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference
☆29Jan 27, 2026Updated 5 months ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
iankur / vqllm
View on GitHub
Residual vector quantization for KV cache compression in large language model
☆12Oct 22, 2024Updated last year
HandH1998 / QQQ
View on GitHub
QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.
☆158Aug 21, 2025Updated 11 months ago
zhangsichengsjtu / AFPQ
View on GitHub
AFPQ code implementation
☆23Nov 6, 2023Updated 2 years ago
ylsung / rsq
View on GitHub
Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"
☆23Mar 25, 2026Updated 4 months ago
bytedance / AffineQuant
View on GitHub
Official implementation of the ICLR 2024 paper AffineQuant
☆30Mar 30, 2024Updated 2 years ago
Twilight92z / Quantize-Watermark
View on GitHub
☆19Nov 6, 2023Updated 2 years ago
aiha-lab / MX-QLLM
View on GitHub
LLM Inference with Microscaling Format
☆35Nov 12, 2024Updated last year
epfml / pam
View on GitHub
☆16Dec 9, 2023Updated 2 years ago
hfutqian / AdaDFQ
View on GitHub
☆22Oct 27, 2024Updated last year
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
Qualcomm-AI-research / lr-qat
View on GitHub
☆54Nov 5, 2024Updated last year
ByteDance-Seed / decoupleQ
View on GitHub
A quantization algorithm for LLM
☆150Jun 21, 2024Updated 2 years ago
Anonymous1252022 / fp4-all-the-way
View on GitHub
☆52May 20, 2025Updated last year
ScalingIntelligence / CATS
View on GitHub
☆33Nov 11, 2024Updated last year
ruikangliu / Quantized-Reasoning-Models
View on GitHub
[COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models"
☆77Jul 8, 2025Updated last year
htqin / DSG
View on GitHub
This project is the official implementation of our accepted IEEE TPAMI paper Diverse Sample Generation: Pushing the Limit of Data-free Qu…
☆15Feb 26, 2023Updated 3 years ago
ModelTC / AAAI2023_EAMPD
View on GitHub
AAAI2023 Efficient and Accurate Models towards Practical Deep Learning Baseline
☆13Nov 29, 2022Updated 3 years ago
zqOuO / GWT
View on GitHub
☆13May 4, 2026Updated 2 months ago
IST-DASLab / HALO
View on GitHub
HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arx…
☆31Feb 17, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
iLearn-Lab / ACL25-PTQ1.61
View on GitHub
☆15Apr 6, 2026Updated 3 months ago
MAC-AutoML / OMPQ
View on GitHub
☆25Dec 11, 2021Updated 4 years ago
fmfi-compbio / admm-pruning
View on GitHub
☆30Jul 22, 2024Updated 2 years ago
xjjxmu / QSLAW
View on GitHub
The official code for "Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation" | [MM2…
☆14Dec 7, 2024Updated last year
kyrie-23 / linear_task_arithmetic
View on GitHub
☆12Jul 30, 2025Updated 11 months ago
ArminAzizi98 / LaMDA
View on GitHub
☆15Nov 7, 2024Updated last year
facebookresearch / SpinQuant
View on GitHub
Code repo for the paper "SpinQuant LLM quantization with learned rotations"
☆417Feb 14, 2025Updated last year