ModelTC/awesome-lm-system

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/ModelTC/awesome-lm-system)

ModelTC / awesome-lm-system

Summary of system papers/frameworks/codes/tools on training or serving large model

☆57

Alternatives and similar repositories for awesome-lm-system

Users that are interested in awesome-lm-system are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ModelTC / quant_horizon
View on GitHub
☆11Jan 10, 2025Updated last year
ModelTC / NART
View on GitHub
NART = NART is not A RunTime, a deep learning inference framework.
☆37Mar 2, 2023Updated 3 years ago
ModelTC / Outlier_Suppression_Plus
View on GitHub
Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…
☆52Oct 21, 2023Updated 2 years ago
ModelTC / EasyLLM
View on GitHub
Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing …
☆49Sep 18, 2024Updated last year
Qualcomm-AI-research / BayesianBits
View on GitHub
☆22Feb 11, 2022Updated 4 years ago
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
ModelTC / LPCV2021_Winner_Solution
View on GitHub
☆28Nov 5, 2021Updated 4 years ago
jgoeders / dac_sdc_2021_designs
View on GitHub
☆19Mar 16, 2022Updated 4 years ago
moranshkolnik / RobustQuantization
View on GitHub
source code of the paper: Robust Quantization: One Model to Rule Them All
☆42Mar 24, 2023Updated 3 years ago
wimh966 / QDrop
View on GitHub
The official PyTorch implementation of the ICLR2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan…
☆132Sep 23, 2025Updated 10 months ago
Adamdad / Samesame
View on GitHub
An Tensorflow.keras implementation of Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorizatio…
☆10Dec 18, 2019Updated 6 years ago
ModelTC / L2_Compression
View on GitHub
☆13Jun 16, 2024Updated 2 years ago
ModelTC / msbench
View on GitHub
A tool for model sparse based on torch.fx
☆13Jun 3, 2024Updated 2 years ago
ysbsb / awesome-quantization
View on GitHub
Awesome Quantization Paper lists with Codes
☆10Feb 24, 2021Updated 5 years ago
heterosys / mlir-vitis
View on GitHub
📥 🎯 (1,4/4) an MLIR-based toolchain with Vitis HLS LLVM input/output targeting FPGAs.
☆15Nov 15, 2022Updated 3 years ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
ModelTC / Dipoorlet
View on GitHub
Offline Quantization Tools for Deploy.
☆143Dec 28, 2023Updated 2 years ago
1157942086 / CVPR2020_Auxiliary_Quantization
View on GitHub
Training Quantized Neural Networks with a Full-precision Auxiliary Module
☆13Jun 19, 2020Updated 6 years ago
ModelTC / LPCV_2023_solution
View on GitHub
☆17Nov 29, 2023Updated 2 years ago
ModelTC / LightTTS
View on GitHub
LightTTS is a lightweight TTS inference framework optimized for CosyVoice2 and CosyVoice3, enabling fast and scalable speech synthesis in…
☆47Apr 14, 2026Updated 3 months ago
enyac-group / evol-q
View on GitHub
Quantization in the Jagged Loss Landscape of Vision Transformers
☆13Oct 22, 2023Updated 2 years ago
allenbai01 / ProxQuant
View on GitHub
ProxQuant: Quantized Neural Networks via Proximal Operators
☆30Feb 19, 2019Updated 7 years ago
tinganchen / AlignQ
View on GitHub
[CVPR 2022] AlignQ: Alignment Quantization with ADMM-based Correlation Preservation
☆11Jan 6, 2023Updated 3 years ago
JinjieNi / MixEval-X
View on GitHub
The official github repo for MixEval-X, the first any-to-any, real-world benchmark.
☆17Feb 15, 2025Updated last year
deJQK / FracBits
View on GitHub
Neural Network Quantization With Fractional Bit-widths
☆11Feb 19, 2021Updated 5 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
SqueezeAILab / KVQuant
View on GitHub
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
☆431Aug 13, 2024Updated last year
SHI-Labs / Any-Precision-DNNs
View on GitHub
Any-Precision Deep Neural Networks (AAAI 2021)
☆62May 2, 2020Updated 6 years ago
cat538 / SKVQ
View on GitHub
[COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
☆24Oct 5, 2024Updated last year
kssteven418 / Q-ASR
View on GitHub
[ICASSP'22] Integer-only Zero-shot Quantization for Efficient Speech Recognition
☆34Oct 11, 2021Updated 4 years ago
jgoeders / dac_sdc_2020_designs
View on GitHub
Designs for finalist teams of the DAC System Design Contest
☆37Jul 8, 2020Updated 6 years ago
HPMLL / BurstGPT
View on GitHub
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆280Jun 30, 2026Updated 3 weeks ago
modelscope / Katz
View on GitHub
[ATC'25] Katz is a high-performance serving system designed specifically for diffusion model workflows with multiple adapters.
☆24May 26, 2025Updated last year
AChen-qaq / ProML
View on GitHub
Code for paper "Prompt-Based Metric Learning for Few-shot NER".
☆22Nov 14, 2023Updated 2 years ago
xushoukai / GDFQ
View on GitHub
official implementation of Generative Low-bitwidth Data Free Quantization(GDFQ)
☆55Jul 23, 2023Updated 3 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
ModelTC / HarmoniCa
View on GitHub
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching i…
☆46Jul 10, 2025Updated last year
SqueezeAILab / open_source_projects
View on GitHub
Open Source Projects from Pallas Lab
☆21Oct 10, 2021Updated 4 years ago
flyinglandlord / BUAA-CO-2021
View on GitHub
2021秋北航计算机组成原理
☆15Mar 31, 2022Updated 4 years ago
ZhangYuQAQ / Hardware-Acceleration-Circuit-Design-of-Object-Detection-Network-Based-on-FPGA
View on GitHub
2020 xilinx summer school
☆20Aug 13, 2020Updated 5 years ago
DIG-Beihang / RobustART
View on GitHub
The first comprehensive Robustness investigation benchmark on large-scale dataset ImageNet regarding ARchitecture design and Training tec…
☆149Feb 4, 2026Updated 5 months ago
ziplab / QTool
View on GitHub
Collections of model quantization algorithms. Any issues, please contact Peng Chen (blueardour@gmail.com)
☆73Oct 7, 2021Updated 4 years ago
Adlik / smoothquantplus
View on GitHub
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆23Mar 15, 2024Updated 2 years ago