utkarsh-dmx/project-resq

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/utkarsh-dmx/project-resq)

utkarsh-dmx / project-resq

☆35

Alternatives and similar repositories for project-resq

Users that are interested in project-resq are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

JingyangXiang / DFRot
View on GitHub
[COLM 2025] DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation; 知乎：https://zhuanlan.zhihu.c…
☆30Mar 5, 2025Updated last year
amoghj98 / rcac-utils
View on GitHub
This repository contains bash scripts for launching, orchestrating, managing, and monitoring jobs on Purdue's RCAC clusters.
☆23Dec 22, 2025Updated 7 months ago
Sakshi09Ch / CoDeC
View on GitHub
[TMLR] CoDeC: Communication-Efficient Decentralized Continual Learning
☆13Apr 17, 2024Updated 2 years ago
ruikangliu / FlatQuant
View on GitHub
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
☆223Nov 25, 2025Updated 8 months ago
Intelligent-Computing-Lab-Panda / TesseraQ
View on GitHub
☆25Oct 31, 2024Updated last year
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
Xingyu-Zheng / FOEM
View on GitHub
(AAAI 2026) First-Order Error Matters: Accurate Compensation for Quantized Large Language Models
☆16Apr 16, 2026Updated 3 months ago
snu-mllab / GuidedQuant
View on GitHub
Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)
☆53Apr 13, 2026Updated 3 months ago
Qualcomm-AI-research / lr-qat
View on GitHub
☆54Nov 5, 2024Updated last year
IST-DASLab / FP-Quant
View on GitHub
☆115Feb 26, 2026Updated 5 months ago
ylsung / rsq
View on GitHub
Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"
☆23Mar 25, 2026Updated 4 months ago
ChengZhang-98 / LQER
View on GitHub
Official implementation of ICML'24 paper "LQER: Low-Rank Quantization Error Reconstruction for LLMs"
☆19Jul 11, 2024Updated 2 years ago
BrotherHappy / OSTQuant
View on GitHub
[ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt…
☆94Apr 8, 2025Updated last year
lhxcs / DVD-Quant
View on GitHub
☆17Oct 5, 2025Updated 9 months ago
thu-nics / PM-KVQ
View on GitHub
The official code implementation for paper "PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs"
☆29May 24, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
Hsu1023 / DuQuant
View on GitHub
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
☆186Apr 24, 2026Updated 3 months ago
xvyaward / owq
View on GitHub
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…
☆72Mar 7, 2024Updated 2 years ago
sangamesh-kodge / class_forgetting
View on GitHub
[Deep Unlearning-PyTorch] Class Forgetting as in paper "Deep Unlearning: Fast and Efficient Training-free Approach to Controlled Forgetti…
☆16Jul 26, 2024Updated 2 years ago
lwy2020 / MicroMix
View on GitHub
MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models
☆28Apr 2, 2026Updated 3 months ago
DensoITLab / bitprune
View on GitHub
☆11Apr 5, 2023Updated 3 years ago
HuangOwen / Quantization-Variation
View on GitHub
[TMLR] Official PyTorch implementation of paper "Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precisio…
☆49Sep 27, 2024Updated last year
zhangsichengsjtu / AFPQ
View on GitHub
AFPQ code implementation
☆23Nov 6, 2023Updated 2 years ago
zysxmu / DFSQ
View on GitHub
super-resolution; post-training quantization; model compression
☆14Nov 10, 2023Updated 2 years ago
Aaronhuang-778 / SliM-LLM
View on GitHub
[ICML 2025] SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
☆62Aug 9, 2024Updated last year
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
ChenMnZ / PrefixQuant
View on GitHub
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
☆176Nov 26, 2025Updated 8 months ago
iLearn-Lab / ACL25-PTQ1.61
View on GitHub
☆15Apr 6, 2026Updated 3 months ago
HandH1998 / QQQ
View on GitHub
QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.
☆158Aug 21, 2025Updated 11 months ago
facebookresearch / SpinQuant
View on GitHub
Code repo for the paper "SpinQuant LLM quantization with learned rotations"
☆417Feb 14, 2025Updated last year
Axel-gu / DenoiseRotator
View on GitHub
☆23Nov 26, 2025Updated 8 months ago
JunyiWuCode / QuantCache
View on GitHub
[ICCV 2025] QuantCache：Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation
☆18Sep 26, 2025Updated 10 months ago
lliai / EMQ-series
View on GitHub
[ICCV-2023] EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
☆29Dec 6, 2023Updated 2 years ago
efeslab / Atom
View on GitHub
[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
☆344Jul 2, 2024Updated 2 years ago
INT-FlashAttention2024 / INT-FlashAttention
View on GitHub
☆91Jan 23, 2025Updated last year
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
spcl / QuaRot
View on GitHub
Code for Neurips24 paper: QuaRot, an end-to-end 4-bit inference of large language models.
☆523Nov 26, 2024Updated last year
UNITES-Lab / MoE-Quantization
View on GitHub
Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"
☆31Jun 30, 2025Updated last year
manishnagaraj / DOTIE
View on GitHub
Implementation of DOTIE - Detecting Objects through Temporal Isolation of Events using a Spiking Architecture
☆33Sep 27, 2023Updated 2 years ago
StiphyJay / MQuant
View on GitHub
[ACM MM2025]: MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization
☆44Aug 13, 2025Updated 11 months ago
wlfeng0509 / Q-VDiT
View on GitHub
(ICML-2025) Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers
☆21Aug 13, 2025Updated 11 months ago
IST-DASLab / qutlass
View on GitHub
QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
☆192Updated this week
facebookresearch / ParetoQ
View on GitHub
This repository contains the training code of ParetoQ introduced in our work "ParetoQ Scaling Laws in Extremely Low-bit LLM Quantization"
☆131Oct 15, 2025Updated 9 months ago