pprp/STBLLM

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/pprp/STBLLM)

pprp / STBLLM

[ICLR25] STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs

☆20

Alternatives and similar repositories for STBLLM

Users that are interested in STBLLM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

racoonykc / RobuQ
View on GitHub
RobuQ: Pushing DiTS to W1.58A2 via Robust Activation Quantization
☆15Jun 28, 2026Updated 3 weeks ago
pprp / ACBench
View on GitHub
[ICML25] Agentic Compression Benchmark (ACBench)
☆17Jul 2, 2025Updated last year
ZHITENGLI / ARB-LLM
View on GitHub
[ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models
☆30Aug 5, 2025Updated 11 months ago
facebookresearch / ParetoQ
View on GitHub
This repository contains the training code of ParetoQ introduced in our work "ParetoQ Scaling Laws in Extremely Low-bit LLM Quantization"
☆131Oct 15, 2025Updated 9 months ago
Xingyu-Zheng / FOEM
View on GitHub
(AAAI 2026) First-Order Error Matters: Accurate Compensation for Quantized Large Language Models
☆16Apr 16, 2026Updated 3 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
insuhan / calibquant
View on GitHub
☆21Apr 3, 2025Updated last year
XIANGLONGYAN / PBS2P
View on GitHub
PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs"
☆13Jul 11, 2026Updated last week
kssteven418 / SqueezeLLM-gradients
View on GitHub
☆21Feb 5, 2024Updated 2 years ago
lliai / D2MoE
View on GitHub
D^2-MoE: Delta Decompression for MoE-based LLMs Compression
☆82Mar 25, 2025Updated last year
Yeyke / HBLLM
View on GitHub
[NeurIPS 2025 (spotlight)] HBLLM: Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs
☆16Dec 17, 2025Updated 7 months ago
IST-DASLab / FP-Quant
View on GitHub
☆114Feb 26, 2026Updated 4 months ago
BrotherHappy / OSTQuant
View on GitHub
[ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt…
☆93Apr 8, 2025Updated last year
YanjingLi0202 / Bi-ViT
View on GitHub
The official implementation of the AAAI 2024 paper Bi-ViT.
☆13Dec 18, 2023Updated 2 years ago
Aaronhuang-778 / SliM-LLM
View on GitHub
[ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
☆62Aug 9, 2024Updated last year
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
xunzhang1128 / Q-DiT4SR
View on GitHub
[ICML 2026] Q-DiT4SR: Exploration of Detail-Preserving Diffusion Transformer Quantization for Real-World Image Super-Resolution
☆19May 1, 2026Updated 2 months ago
Qualcomm-AI-research / pruning-vs-quantization
View on GitHub
☆26Mar 1, 2024Updated 2 years ago
dmis-lab / Outlier-Safe-Pre-Training
View on GitHub
[ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
☆39Nov 4, 2025Updated 8 months ago
Ther-nullptr / circult-eda-mlsys-tinyml-arxiv-daily
View on GitHub
🎓Automatically Update circult-eda-mlsys-tinyml Papers Daily using Github Actions (Update Every 8th hours)
☆10Jul 13, 2026Updated last week
FlyEgle / imageclassification
View on GitHub
General Image Classification Code base
☆22Jul 12, 2021Updated 5 years ago
nbasyl / OFQ
View on GitHub
The official implementation of the ICML 2023 paper OFQ-ViT
☆39Oct 3, 2023Updated 2 years ago
wlfeng0509 / Q-VDiT
View on GitHub
(ICML-2025) Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers
☆20Aug 13, 2025Updated 11 months ago
OpenBitSys / BitDistiller
View on GitHub
[ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.
☆139May 16, 2024Updated 2 years ago
Aaronhuang-778 / BiLLM
View on GitHub
[ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
☆235Jan 11, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
liyunqianggyn / Awesome-LLMs-Pruning
View on GitHub
Awesome LLM pruning papers all-in-one repository with integrating all useful resources and insights.
☆172May 19, 2026Updated 2 months ago
cjf00000 / StatQuant
View on GitHub
code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"
☆29Oct 31, 2020Updated 5 years ago
Yanqi-Chen / LATS
View on GitHub
To appear in the 11th International Conference on Learning Representations (ICLR 2023).
☆18Feb 24, 2023Updated 3 years ago
ruikangliu / Quantized-Reasoning-Models
View on GitHub
[COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models"
☆77Jul 8, 2025Updated last year
pprp / Pruner-Zero
View on GitHub
[ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
☆100Nov 25, 2024Updated last year
Kai-Liu001 / Awesome-Model-Quantization
View on GitHub
This repository contains low-bit quantization papers from 2020 to 2026 on top conference.
☆192Jun 25, 2026Updated 3 weeks ago
inclusionAI / MoBE
View on GitHub
Mixture-of-Basis-Experts for Compressing MoE-based LLMs
☆37Dec 24, 2025Updated 6 months ago
ModelTC / QVGen
View on GitHub
[ICLR 2026] This is the official PyTorch implementation of "QVGen: Pushing the Limit of Quantized Video Generative Models".
☆32Feb 11, 2026Updated 5 months ago
zhuhanqing / Lightening-Transformer-AE
View on GitHub
Artifact evaluation for HPCA'24 paper Lightening-Transformer: A Dynamically-operated Optically-interconnected Photonic Transformer Accele…
☆11Mar 3, 2024Updated 2 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
Aaronhuang-778 / Mixture-Compressor-MoE
View on GitHub
[ICLR 2025, IEEE TPAMI 2026] Mixture Compressor & MC#
☆75Feb 12, 2025Updated last year
DSA-MLOPS / DSAA6000I
View on GitHub
☆28Jun 9, 2024Updated 2 years ago
Qualcomm-AI-research / gptvq
View on GitHub
☆42Mar 28, 2024Updated 2 years ago
AI-Efficiency / Qwen3-Quantization-Toolkit
View on GitHub
☆79Sep 19, 2025Updated 10 months ago
AI-Efficiency / BiBench
View on GitHub
[ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar…
☆56Mar 4, 2024Updated 2 years ago
arashardakani / Learning-Recurrent-Binary-Ternary-Weights
View on GitHub
Learning-Recurrent-Binary-Ternary-Weights
☆13Dec 4, 2018Updated 7 years ago
AIoT-MLSys-Lab / QuantVLA
View on GitHub
[CVPR'26] QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
☆42Mar 25, 2026Updated 3 months ago