HITSZ-Miao-Group / Paper-Reading-List
☆9 · Updated last year
Alternatives and similar repositories for Paper-Reading-List
Users interested in Paper-Reading-List are comparing it to the repositories listed below.
- List of papers related to neural network quantization in recent AI conferences and journals. ☆679 · Updated 4 months ago
- Awesome list for LLM pruning. ☆246 · Updated 7 months ago
- ☆267 · Updated 11 months ago
- A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, including languag… ☆186 · Updated 6 months ago
- Awesome LLM compression research papers and tools. ☆1,630 · Updated last month
- Awesome all-in-one repository of LLM pruning papers, integrating useful resources and insights. ☆111 · Updated 2 weeks ago
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆165 · Updated 10 months ago
- A curated list for Efficient Large Language Models. ☆1,824 · Updated last month
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ☆869 · Updated this week
- [TMLR 2024] Efficient Large Language Models: A Survey. ☆1,196 · Updated last month
- Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models. ☆414 · Updated 8 months ago
- A simple and effective LLM pruning approach. ☆782 · Updated last year
- ☆26 · Updated last week
- [ACM MM 2025] MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization. ☆14 · Updated this week
- An implementation of the DISP-LLM method from the NeurIPS 2024 paper: Dimension-Independent Structural Pruning for Large Language Models. ☆21 · Updated last week
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ☆506 · Updated last week
- [ICLR 2025] OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitt… ☆72 · Updated 4 months ago
- ☆23 · Updated 4 months ago
- [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich… ☆1,051 · Updated 10 months ago
- Awesome papers and resources on deep neural network pruning, with source code. ☆161 · Updated 11 months ago
- DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing (WACV 2025). ☆10 · Updated 8 months ago
- Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models". ☆31 · Updated 3 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings). ☆303 · Updated 3 months ago
- Official implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS). ☆43 · Updated 4 months ago
- Code repo for the paper "SpinQuant: LLM Quantization with Learned Rotations". ☆307 · Updated 5 months ago
- Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity". ☆73 · Updated last month
- [TKDE 2025] The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models". ☆403 · Updated 3 weeks ago
- PyTorch implementation of PTQ4DiT (https://arxiv.org/abs/2405.16005). ☆32 · Updated 9 months ago
- A curated reading list of research on Mixture-of-Experts (MoE). ☆641 · Updated 9 months ago
- GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low-Rank Decomposition. ☆13 · Updated 3 months ago