HITSZ-Miao-Group / Paper-Reading-List
☆9 Updated last year
Alternatives and similar repositories for Paper-Reading-List:
Users who are interested in Paper-Reading-List are comparing it to the repositories listed below:
- Awesome list for LLM pruning. ☆222 Updated 4 months ago
- Awesome LLM pruning papers: an all-in-one repository integrating all useful resources and insights. ☆84 Updated 4 months ago
- ☆237 Updated 8 months ago
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆157 Updated 6 months ago
- A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including languag… ☆177 Updated 2 months ago
- Code for the NeurIPS 2024 paper: QuaRot, an end-to-end 4-bit inference of large language models. ☆379 Updated 5 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) ☆258 Updated this week
- List of papers related to neural network quantization in recent AI conferences and journals. ☆597 Updated last month
- PyTorch implementation of PTQ4DiT (https://arxiv.org/abs/2405.16005) ☆27 Updated 5 months ago
- Official PyTorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity" ☆64 Updated 10 months ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ☆696 Updated last week
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ☆388 Updated 2 weeks ago
- [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich… ☆1,002 Updated 6 months ago
- A simple and effective LLM pruning approach. ☆738 Updated 8 months ago
- The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models". ☆331 Updated last month
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆118 Updated last year
- ☆49 Updated 4 months ago
- [ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache ☆288 Updated 3 months ago
- Code Repository of Evaluating Quantized Large Language Models ☆121 Updated 7 months ago
- Awesome list for LLM quantization ☆202 Updated 4 months ago
- ☆23 Updated 3 weeks ago
- [ICLR 2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ☆46 Updated 2 weeks ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ☆129 Updated 2 months ago
- Survey Paper List - Efficient LLM and Foundation Models ☆246 Updated 7 months ago
- PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight) ☆347 Updated 2 months ago
- Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS) ☆31 Updated last month
- ☆40 Updated 10 months ago
- Official implementation for LaCo (EMNLP 2024 Findings) ☆16 Updated 6 months ago
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers ☆185 Updated 2 years ago
- Code repo for the paper "SpinQuant: LLM quantization with learned rotations" ☆259 Updated 2 months ago