HITSZ-Miao-Group / Paper-Reading-List

☆9

Alternatives and similar repositories for Paper-Reading-List

Users who are interested in Paper-Reading-List are comparing it to the repositories listed below.

Zhen-Dong / Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
☆679 Updated 4 months ago
pprp / Awesome-LLM-Prune
Awesome list for LLM pruning.
☆246 Updated 7 months ago
hrcheng1066 / awesome-pruning
☆267 Updated 11 months ago
Efficient-ML / Awesome-Efficient-AIGC
A list of papers, docs, and code about efficient AIGC. This repo aims to provide the info for efficient AIGC research, including languag…
☆186 Updated 6 months ago
HuangOwen / Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
☆1,630 Updated last month
liyunqianggyn / Awesome-LLMs-Pruning
All-in-one repository of awesome LLM pruning papers, integrating all useful resources and insights.
☆111 Updated 2 weeks ago
Hsu1023 / DuQuant
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.
☆165 Updated 10 months ago
horseee / Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
☆1,824 Updated last month
hemingkx / SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
☆869 Updated this week
AIoT-MLSys-Lab / Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
☆1,196 Updated last month
spcl / QuaRot
Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.
☆414 Updated 8 months ago
locuslab / wanda
A simple and effective LLM pruning approach.
☆782 Updated last year
IST-DASLab / EvoPress
☆26 Updated last week
StiphyJay / MQuant
[ACM MM 2025] MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization
☆14 Updated this week
ZhengaoLi / DISP-LLM-Dimension-Independent-Structural-Pruning
An implementation of the DISP-LLM method from the NeurIPS 2024 paper: Dimension-Independent Structural Pruning for Large Language Models.
☆21 Updated last week
October2001 / Awesome-KV-Cache-Compression
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
☆506 Updated last week
BrotherHappy / OSTQuant
[ICLR 2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt…
☆72 Updated 4 months ago
antgroup / cakekv
☆23 Updated 4 months ago
horseee / LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich…
☆1,051 Updated 10 months ago
ghimiredhikura / Awasome-Pruning
Awesome papers and resources on deep neural network pruning, with source code.
☆161 Updated 11 months ago
DZY122 / DiTAS
DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing (WACV 2025)
☆10 Updated 8 months ago
RUCKBReasoning / LLM-Streamline
Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models"
☆31 Updated 3 months ago
hemingkx / Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
☆303 Updated 3 months ago
HArmonizedSS / HASS
Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS)
☆43 Updated 4 months ago
facebookresearch / SpinQuant
Code repo for the paper "SpinQuant: LLM quantization with learned rotations"
☆307 Updated 5 months ago
luuyin / OWL
Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
☆73 Updated last month
withinmiaov / A-Survey-on-Mixture-of-Experts-in-LLMs
[TKDE'25] The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models".
☆403 Updated 3 weeks ago
adreamwu / PTQ4DiT
PyTorch implementation of PTQ4DiT https://arxiv.org/abs/2405.16005
☆32 Updated 9 months ago
codecaution / Awesome-Mixture-of-Experts-Papers
A curated reading list of research in Mixture-of-Experts(MoE).
☆641 Updated 9 months ago
stephenqz / OATS
GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition
☆13 Updated 3 months ago