thu-nics/MBQ

The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"

☆93

Alternatives and similar repositories for MBQ

Users who are interested in MBQ are comparing it to the repositories listed below.

StiphyJay / MQuant
[ACM MM2025]: MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization
☆44 · Aug 13, 2025 · Updated 11 months ago
ChangyuanWang17 / QVLM
[NeurIPS'24] Efficient and accurate memory saving method towards W4A4 large multi-modal models.
☆103 · Jan 3, 2025 · Updated last year
ucas-xiang / QIG
[CVPR 2026] Fine-Grained Post-Training Quantization for Large Vision Language Models with Quantization-Aware Integrated Gradients
☆23 · Jun 21, 2026 · Updated last month
Xingyu-Zheng / FOEM
(AAAI 2026) First-Order Error Matters: Accurate Compensation for Quantized Large Language Models
☆16 · Apr 16, 2026 · Updated 3 months ago
Intelligent-Computing-Lab-Panda / GPTAQ
Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692)
☆92 · Jul 28, 2025 · Updated 11 months ago
SAI-Lab-NYU / QSVD
This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value …
☆28 · May 16, 2026 · Updated 2 months ago
thu-nics / qllm-eval
Code Repository of Evaluating Quantized Large Language Models
☆135 · Sep 8, 2024 · Updated last year
thu-nics / PM-KVQ
The official code implementation for paper "PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs"
☆29 · May 24, 2025 · Updated last year
thu-nics / FrameFusion
[ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models"
☆76 · Jan 13, 2026 · Updated 6 months ago
ChenMnZ / PrefixQuant
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆176 · Nov 26, 2025 · Updated 7 months ago
imagination-research / LCSC
[ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
☆16 · Feb 15, 2025 · Updated last year
antgroup / SPEED-Q
☆27 · Jan 20, 2026 · Updated 6 months ago
alibaba / EfficientAI
☆48 · May 9, 2026 · Updated 2 months ago
ModelTC / Outlier_Suppression_Plus
Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…
☆52 · Oct 21, 2023 · Updated 2 years ago
JingyangXiang / DFRot
[COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation; Zhihu: https://zhuanlan.zhihu.c…
☆30 · Mar 5, 2025 · Updated last year
Kai-Liu001 / Awesome-Model-Quantization
This repository contains low-bit quantization papers from 2020 to 2026 from top conferences.
☆197 · Jun 25, 2026 · Updated last month
thu-nics / MixDQ
[ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
☆50 · Nov 27, 2024 · Updated last year
HuangOwen / QAT-ACS
[TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection"
☆39 · Aug 20, 2024 · Updated last year
Aaronhuang-778 / SliM-LLM
[ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
☆62 · Aug 9, 2024 · Updated last year
BrotherHappy / OSTQuant
[ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt…
☆94 · Apr 8, 2025 · Updated last year
ruikangliu / FlatQuant
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
☆223 · Nov 25, 2025 · Updated 8 months ago
zhangsichengsjtu / AFPQ
AFPQ code implementation
☆23 · Nov 6, 2023 · Updated 2 years ago
wangqinsi1 / Dobi-SVD
[ICLR 2025] Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives
☆54 · Oct 19, 2025 · Updated 9 months ago
ShiheWang / FIMA-Q
[CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
☆29 · Jun 16, 2025 · Updated last year
chenzx921020 / MoEQuant
☆17 · Apr 7, 2025 · Updated last year
thu-nics / ViDiT-Q
[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
☆164 · Mar 21, 2025 · Updated last year
Hsu1023 / DuQuant
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
☆186 · Apr 24, 2026 · Updated 3 months ago
insuhan / calibquant
☆21 · Apr 3, 2025 · Updated last year
GoatWu / APHQ-ViT
[CVPR 2025] APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
☆44 · Apr 7, 2025 · Updated last year
A-suozhang / MixDQ
[ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
☆14 · Nov 27, 2024 · Updated last year
fuvty / DeSCo
[WSDM'24 Oral] The official implementation of paper <DeSCo: Towards Generalizable and Scalable Deep Subgraph Counting>
☆24 · Mar 11, 2024 · Updated 2 years ago
charbel-sakr / Fixed-Point-Training
Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm.
☆10 · Jan 24, 2019 · Updated 7 years ago
thu-nics / CLAP-triangle-counting
[DATE'23] The official code for paper <CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory>
☆24 · May 25, 2026 · Updated last month
AutoLab-SAI-SJTU / QVLA
[ICLR'26] QVLA
☆46 · Feb 4, 2026 · Updated 5 months ago
ModelTC / LightCompress
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
☆735 · May 14, 2026 · Updated 2 months ago
iLearn-Lab / ACL25-PTQ1.61
☆15 · Apr 6, 2026 · Updated 3 months ago
OpenGVLab / EfficientQAT
[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
☆342 · Apr 10, 2026 · Updated 3 months ago
xjjxmu / QSLAW
The official code for "Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation" | [MM2…
☆14 · Dec 7, 2024 · Updated last year
facebookresearch / ParetoQ
This repository contains the training code of ParetoQ introduced in our work "ParetoQ Scaling Laws in Extremely Low-bit LLM Quantization"
☆131 · Oct 15, 2025 · Updated 9 months ago