[ACM MM2025]: MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization
☆37Aug 13, 2025Updated 6 months ago
Alternatives and similar repositories for MQuant
Users who are interested in MQuant are comparing it to the repositories listed below.
- Code repository for ICLR 2025 paper "LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid"☆25Mar 2, 2025Updated last year
- [NeurIPS'24]Efficient and accurate memory saving method towards W4A4 large multi-modal models.☆98Jan 3, 2025Updated last year
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…☆50Oct 21, 2023Updated 2 years ago
- [ACL2025 Oral🔥]Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling☆22Nov 11, 2025Updated 3 months ago
- Pytorch implementation of our paper accepted by CVPR 2022 -- IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Sh…☆36Mar 2, 2022Updated 4 years ago
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"☆210Nov 25, 2025Updated 3 months ago
- Official implementation of the paper SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training☆44Dec 11, 2024Updated last year
- Reproduction of 'Analysing Mathematical Reasoning Abilities of Neural Models', Saxton et al. 2019☆12Dec 8, 2022Updated 3 years ago
- [ICML 2025] Fast and Low-Cost Genomic Foundation Models via Outlier Removal.☆17Jun 19, 2025Updated 8 months ago
- An efficient distillation method for flow matching models☆22Feb 1, 2026Updated last month
- A Low-Overhead tool for Floating-Point Exception Detection in NVIDIA GPUs☆12Dec 17, 2024Updated last year
- ☆12Aug 31, 2023Updated 2 years ago
- MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models☆28Feb 12, 2026Updated 3 weeks ago
- 🎓Automatically update circult-eda-mlsys-tinyml papers daily using GitHub Actions (updates every 8 hours)☆10Updated this week
- ☆13May 8, 2023Updated 2 years ago
- ☆25Nov 22, 2024Updated last year
- ☆14Oct 6, 2023Updated 2 years ago
- This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value …☆25Dec 1, 2025Updated 3 months ago
- ☆11Sep 30, 2023Updated 2 years ago
- KV cache compression via sparse coding☆17Oct 26, 2025Updated 4 months ago
- Uses an LSTM model to train on the data and predict the accusations☆10Jan 24, 2019Updated 7 years ago
- Python script that injects code into a binary☆12Oct 30, 2020Updated 5 years ago
- ☆19Oct 2, 2024Updated last year
- [EMNLP 2024] Quantize LLM to extremely low-bit, and finetune the quantized LLMs☆15Jul 18, 2024Updated last year
- Unlocks password-protected pptx files so that they can be edited.☆13May 6, 2021Updated 4 years ago
- Static code injection using text padding and reverse text extension☆11Jun 7, 2017Updated 8 years ago
- LGPNet: Alleviating A Few Labeled Data and Large-Scale Network Dilemmas in Grasping Detection☆18Apr 17, 2022Updated 3 years ago
- Image Quality Assessment Paper Reading☆15Sep 11, 2022Updated 3 years ago
- ☆19Apr 3, 2025Updated 11 months ago
- ☆38Oct 11, 2025Updated 4 months ago
- Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs".☆16Sep 15, 2024Updated last year
- ☆20Oct 13, 2024Updated last year
- ☆41Oct 15, 2025Updated 4 months ago
- ☆18Feb 4, 2025Updated last year
- ☆19May 11, 2021Updated 4 years ago
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models☆21Jan 29, 2025Updated last year
- 2018-2019 collection of computer networks assignments, second semester of junior year (1711342 李纪)☆12Apr 26, 2020Updated 5 years ago
- [ICCV 2025] Task-Specific Zero-shot Quantization-Aware Training for Object Detection☆25Sep 26, 2025Updated 5 months ago
- TPDiff: Temporal Pyramid Video Diffusion Model☆25Mar 13, 2025Updated 11 months ago