IST-DASLab/QuEST

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/IST-DASLab/QuEST)

IST-DASLab / QuEST

Work in progress.

☆80

Alternatives and similar repositories for QuEST

Users that are interested in QuEST are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

IST-DASLab / Quartet
View on GitHub
☆126Mar 18, 2026Updated 4 months ago
PingchengDong / GQA-LUT
View on GitHub
The official implementation of the DAC 2024 paper GQA-LUT
☆24Dec 20, 2024Updated last year
IST-DASLab / MoE-Quant
View on GitHub
Code for data-aware compression of DeepSeek models
☆75Dec 11, 2025Updated 7 months ago
Anonymous1252022 / Megatron-DeepSpeed
View on GitHub
☆18Sep 22, 2024Updated last year
Anonymous1252022 / fp4-all-the-way
View on GitHub
☆52May 20, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
thu-ml / TetraJet-MXFP4Training
View on GitHub
Pytorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training
☆40May 4, 2026Updated 2 months ago
HuangOwen / RoLoRA
View on GitHub
[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
☆40Sep 24, 2024Updated last year
TianjinYellow / SPAM-Optimizer
View on GitHub
☆36Mar 12, 2025Updated last year
krafton-ai / lexico
View on GitHub
KV cache compression via sparse coding
☆17Oct 26, 2025Updated 8 months ago
Cornell-RelaxML / qtip
View on GitHub
☆180Jun 22, 2025Updated last year
IST-DASLab / Quartet-II
View on GitHub
Quartet II Official Code
☆75May 1, 2026Updated 2 months ago
microsoft / VPTQ
View on GitHub
VPTQ, A Flexible and Extreme low-bit quantization algorithm
☆681Apr 25, 2025Updated last year
IST-DASLab / sparseprop
View on GitHub
☆16Sep 27, 2023Updated 2 years ago
HanGuo97 / flute
View on GitHub
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
☆391Apr 13, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
ChenMnZ / INT_vs_FP
View on GitHub
[ICML 2026]A framework to compare low-bit integer and float-point formats
☆81May 6, 2026Updated 2 months ago
IST-DASLab / Sparse-Marlin
View on GitHub
Boosting 4-bit inference kernels with 2:4 Sparsity
☆96Sep 4, 2024Updated last year
ChenMnZ / PrefixQuant
View on GitHub
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆176Nov 26, 2025Updated 7 months ago
IST-DASLab / gptq-gguf-toolkit
View on GitHub
Efficient non-uniform quantization with GPTQ for GGUF
☆64Sep 17, 2025Updated 10 months ago
facebookresearch / ParetoQ
View on GitHub
This repository contains the training code of ParetoQ introduced in our work "ParetoQ Scaling Laws in Extremely Low-bit LLM Quantization"
☆131Oct 15, 2025Updated 9 months ago
IST-DASLab / HALO
View on GitHub
HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arx…
☆31Feb 17, 2025Updated last year
zqOuO / GWT
View on GitHub
☆13May 4, 2026Updated 2 months ago
dmis-lab / Outlier-Safe-Pre-Training
View on GitHub
[ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
☆39Nov 4, 2025Updated 8 months ago
Intelligent-Computing-Lab-Panda / TesseraQ
View on GitHub
☆25Oct 31, 2024Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
facebookresearch / SpinQuant
View on GitHub
Code repo for the paper "SpinQuant LLM quantization with learned rotations"
☆415Feb 14, 2025Updated last year
IST-DASLab / torch_cgx
View on GitHub
Pytorch distributed backend extension with compression support
☆17Mar 24, 2025Updated last year
pprp / STBLLM
View on GitHub
[ICLR25] STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
☆20Jun 3, 2025Updated last year
amazon-science / mxfp4-llm
View on GitHub
Official implementation for Training LLMs with MXFP4
☆130Apr 25, 2025Updated last year
AIRI-Institute / xland-minigrid-datasets
View on GitHub
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
☆14Jun 19, 2024Updated 2 years ago
IST-DASLab / FP-Quant
View on GitHub
☆114Feb 26, 2026Updated 4 months ago
akhilkedia / TranformersGetStable
View on GitHub
[ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"
☆11Jul 19, 2024Updated 2 years ago
tim-lawson / skip-middle
View on GitHub
Learning to Skip the Middle Layers of Transformers
☆17Aug 7, 2025Updated 11 months ago
NolanoOrg / SpectraSuite
View on GitHub
☆54Jul 18, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
nbasyl / OFQ
View on GitHub
The official implementation of the ICML 2023 paper OFQ-ViT
☆39Oct 3, 2023Updated 2 years ago
Dao-AILab / fast-hadamard-transform
View on GitHub
Fast Hadamard transform in CUDA, with a PyTorch interface
☆338Mar 10, 2026Updated 4 months ago
IST-DASLab / qutlass
View on GitHub
QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
☆191Updated this week
NVlabs / MaskLLM
View on GitHub
[NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models
☆189Jan 1, 2025Updated last year
corl-team / rebased
View on GitHub
Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"
☆169Jan 16, 2025Updated last year
aleksandrinvictor / flow-matching
View on GitHub
Simple reimplementation of Flow Matching for Generative Modeling (https://arxiv.org/abs/2210.02747) paper in PyTorch
☆23Aug 10, 2024Updated last year
FusionBrainLab / LLM-Microscope
View on GitHub
☆72Aug 27, 2024Updated last year