Code repository for ICLR 2025 paper "LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid"
☆27 Mar 2, 2025 Updated last year
Alternatives and similar repositories for LeanQuant
Users that are interested in LeanQuant are comparing it to the libraries listed below.
- ☆15 Jan 12, 2026 Updated 2 months ago
- ☆19 Feb 4, 2025 Updated last year
- [IJCAI 2023] CLE-ViT: Contrastive Learning Encoded Transformer for Ultra-Fine-Grained Visual Categorization. ☆10 Nov 3, 2023 Updated 2 years ago
- [ACM MM2025]: MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization ☆41 Aug 13, 2025 Updated 7 months ago
- HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arx… ☆29 Feb 17, 2025 Updated last year
- PISCO: Precise Video Instance Insertion with Sparse Control ☆56 Feb 13, 2026 Updated last month
- A tool which checks compatibility of CoreML model with Apple Neural Engine ☆14 May 30, 2022 Updated 3 years ago
- PyTorch implementation of Language model compression with weighted low-rank factorization ☆13 Jun 28, 2023 Updated 2 years ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆10 Dec 30, 2024 Updated last year
- Everyone loves OS ☆20 Mar 3, 2026 Updated last month
- A Low-Overhead tool for Floating-Point Exception Detection in NVIDIA GPUs ☆13 Dec 17, 2024 Updated last year
- ☆10 Jun 9, 2017 Updated 8 years ago
- flex-block-attn: an efficient block sparse attention computation library ☆128 Dec 26, 2025 Updated 3 months ago
- ☆21 Feb 5, 2024 Updated 2 years ago
- [ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ☆89 Apr 8, 2025 Updated last year
- ☆11 Mar 25, 2022 Updated 4 years ago
- ☆14 Jun 9, 2017 Updated 8 years ago
- Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention ☆53 Updated this week
- Code for paper "Conversational Product Search Based on Negative Feedback" ☆12 Jun 26, 2020 Updated 5 years ago
- [ACL2025 Oral🔥]Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling ☆27 Nov 11, 2025 Updated 4 months ago
- -A library which does oversampling to allow you to read with a resolution of 10-bit to 21-bit on the Arduino ADC (Analog to Digital Conve… ☆17 Dec 29, 2016 Updated 9 years ago
- egraph <-> json ☆16 Dec 29, 2025 Updated 3 months ago
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs. ☆136 May 16, 2024 Updated last year
- ☆14 Jan 10, 2024 Updated 2 years ago
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" ☆212 Nov 25, 2025 Updated 4 months ago
- End2End Virtual Try-on with Visual Reference, CVPR2026 ☆60 Mar 29, 2026 Updated last week
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ☆51 Oct 21, 2023 Updated 2 years ago
- Simple intermediate representation language for learning and research. ☆20 Mar 27, 2020 Updated 6 years ago
- Differentially Private Synthetic Data Generation [DP-SDG] - Experimental Setups & Knowledge Base - WORK IN PROGRESS ☆12 Jul 26, 2022 Updated 3 years ago
- Using e-graphs to synthesize netlists from boolean logic. ☆14 Jul 26, 2023 Updated 2 years ago
- Perceptron-based branch predictor written in C++ ☆13 Dec 14, 2016 Updated 9 years ago
- The official PyTorch implementation of the ICLR2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ☆129 Sep 23, 2025 Updated 6 months ago
- A compiler of Decaf (an object-oriented compiler) ☆12 Sep 26, 2017 Updated 8 years ago
- MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models ☆28 Apr 2, 2026 Updated last week
- Benchmark datasets for sentiment analysis ☆12 May 18, 2020 Updated 5 years ago
- A step-by-step tutorial about how to use Distributed Data Parallel feature of PyTorch ☆16 Nov 20, 2020 Updated 5 years ago
- A collection of tricks and tools to speed up transformer models ☆199 Mar 31, 2026 Updated last week
- ☆24 Mar 6, 2023 Updated 3 years ago
- MLIR+EqSat ☆26 Jan 10, 2026 Updated 2 months ago