schnell18/lm-quant-toolkit

LLM Quantization toolkit

☆20

Alternatives and similar repositories for lm-quant-toolkit

Users who are interested in lm-quant-toolkit are comparing it to the repositories listed below.

AniZpZ / smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆11 · Dec 13, 2023 · Updated 2 years ago
d3n7 / riffusionDJ
Multichannel Looper/Feedback System for Riffusion
☆14 · May 6, 2023 · Updated 3 years ago
itp-redial / tinyphone
a phone to app implementation using Asterisk and Node.js
☆15 · Oct 27, 2014 · Updated 11 years ago
kq-chen / VLMEvalKit
Open-source evaluation toolkit of large vision-language models (LVLMs), support ~100 VLMs, 30+ benchmarks
☆15 · Feb 17, 2025 · Updated last year
alibaba / EfficientAI
☆48 · May 9, 2026 · Updated 2 months ago
mttn2023 / mttn
MTTN: Multi-Pair Text to Text Narratives for Prompt Generation
☆11 · Feb 4, 2023 · Updated 3 years ago
zyxxmu / Bi-Mask
Pytorch implementation of our paper accepted by ICML 2023 -- "Bi-directional Masks for Efficient N:M Sparse Training"
☆13 · Jun 7, 2023 · Updated 3 years ago
Highlyhotgames / fast_txtgen
☆12 · Apr 4, 2024 · Updated 2 years ago
TerryPei / CSP
Cross-Self KV Cache Pruning for Efficient Vision-Language Inference
☆10 · Dec 15, 2024 · Updated last year
WailordHe / cv-arxiv-daily-wailord
🎓Automatically Update CV Papers Daily using Github Actions (Update Every 12th hours)
☆12 · May 17, 2026 · Updated 2 months ago
BrotherHappy / OSTQuant
[ICLR2025]: OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt…
☆94 · Apr 8, 2025 · Updated last year
OpenGVLab / LLMPrune-BESA
BESA is a differentiable weight pruning technique for large language models.
☆17 · Mar 4, 2024 · Updated 2 years ago
yuny220 / NAR-Former
Pytorch code of [CVPR 2023] "NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction".
☆11 · Mar 14, 2023 · Updated 3 years ago
kssteven418 / SqueezeLLM-gradients
☆21 · Feb 5, 2024 · Updated 2 years ago
sufenlp / MiLoRA
[NAACL 2025] MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning
☆21 · May 31, 2025 · Updated last year
MasterVito / DAC-RL
Official Repo for DAC-RL: Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
☆16 · Feb 26, 2026 · Updated 5 months ago
NVlabs / EoRA
[ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation
☆49 · Apr 21, 2026 · Updated 3 months ago
danielfullmer / nzbfs
NZB Filesystem using FUSE/Python
☆19 · Jan 13, 2016 · Updated 10 years ago
MarvinChung / HW5-TextStyleTransfer
☆15 · Mar 17, 2021 · Updated 5 years ago
hahnyuan / ASVD4LLM
Activation-aware Singular Value Decomposition for Compressing Large Language Models
☆92 · Oct 22, 2024 · Updated last year
wangqinsi1 / CoreInfer
This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Act…
☆18 · Oct 25, 2024 · Updated last year
dlwns147 / amq
[EMNLP 2025] AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
☆16 · Apr 29, 2026 · Updated 2 months ago
0perationPrivacy / VoIPSuite-Mobile
Mobile App code for Android & iOS on React Native
☆27 · Dec 7, 2023 · Updated 2 years ago
Adlik / smoothquantplus
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆23 · Mar 15, 2024 · Updated 2 years ago
zzbright1998 / SentenceKV
Official implementation of "SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching" (COLM 2025). A novel KV cache com…
☆15 · Sep 29, 2025 · Updated 9 months ago
Ph0rk0z / text-generation-webui-testing
A fork of textgen that kept some things like Exllama and old GPTQ.
☆22 · Aug 20, 2024 · Updated last year
mlm-games / KernelSU-Non-GKI
☆17 · Jun 18, 2026 · Updated last month
rickeylohia / plugin.video.stalkervod
Stalker VOD Kodi add-on
☆15 · Aug 7, 2025 · Updated 11 months ago
wangqinsi1 / 2025-ICML-CoreMatching
[ICML 2025] CoreMatching: Co-adaptive Sparse Inference Framework for Comprehensive Acceleration of Vision Language Model
☆16 · May 27, 2025 · Updated last year
random-robbie / mcp-web-browser
An advanced web browsing server for the Model Context Protocol (MCP) powered by Playwright, enabling headless browser interactions throug…
☆27 · Mar 10, 2025 · Updated last year
nikopj / FlashAttention.jl
Julia implementation of flash-attention operation for neural networks.
☆11 · May 31, 2023 · Updated 3 years ago
Akshay-Arjun / Video-Steganography
AES 256 & RSA encrypted video steganography. SRU Hackathon 2022 - Cybersecurity Winners
☆26 · Jul 11, 2025 · Updated last year
isakedo / DNNsim
☆35 · Jul 9, 2020 · Updated 6 years ago
THU-MIG / PrefixKV
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation [NeurIPS 2025]
☆19 · Oct 11, 2025 · Updated 9 months ago
MAC-AutoML / Awesome-Efficient-Large-Models
A list of awesome papers on compression and acceleration of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs).
☆16 · May 12, 2026 · Updated 2 months ago
Dousia / MetricPrompt
Code for KDD 2023 long paper: MetricPrompt: Prompting Model as a Relevance Metric for Few-Shot Text Classification
☆19 · Aug 10, 2024 · Updated last year
tjluyao / kv.run
A model serving framework for various research and production scenarios. Seamlessly built upon the PyTorch and HuggingFace ecosystem.
☆23 · Oct 11, 2024 · Updated last year
IvanYashchuk / PyFenicsAD.jl
Automatic differentiation of FEniCS and Firedrake models in Julia
☆14 · Mar 21, 2021 · Updated 5 years ago
HanGuo97 / lq-lora
☆129 · Jan 22, 2024 · Updated 2 years ago