HITSZ-Miao-Group / Paper-Reading-List
☆9 · Updated last year
Alternatives and similar repositories for Paper-Reading-List
- Awesome list for LLM pruning. ☆224 · Updated 5 months ago
- [NeurIPS 2024 Oral 🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs. ☆159 · Updated 7 months ago
- Awesome all-in-one repository of LLM pruning papers, integrating useful resources and insights. ☆86 · Updated 5 months ago
- A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, including languag… ☆180 · Updated 3 months ago
- ☆245 · Updated 8 months ago
- Official PyTorch implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity". ☆66 · Updated 10 months ago
- Code for the NeurIPS 2024 paper "QuaRot: an end-to-end 4-bit inference of large language models". ☆387 · Updated 5 months ago
- ☆53 · Updated 5 months ago
- ☆23 · Updated this week
- An implementation of the DISP-LLM method from the NeurIPS 2024 paper "Dimension-Independent Structural Pruning for Large Language Models". ☆19 · Updated last month
- Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models". ☆26 · Updated 2 weeks ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings). ☆263 · Updated 3 weeks ago
- ☆41 · Updated 11 months ago
- Official implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS). ☆34 · Updated 2 months ago
- [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich… ☆1,011 · Updated 7 months ago
- ☆14 · Updated 2 months ago
- [ICML 2024] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark". ☆102 · Updated 10 months ago
- Code repository for "Evaluating Quantized Large Language Models". ☆123 · Updated 8 months ago
- A simple and effective LLM pruning approach. ☆746 · Updated 9 months ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ☆725 · Updated last week
- ☆10 · Updated last year
- Awesome papers and resources on deep neural network pruning, with source code. ☆159 · Updated 8 months ago
- Official implementation of LaCo (EMNLP 2024 Findings). ☆16 · Updated 7 months ago
- GitHub repo for "OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition". ☆12 · Updated last month
- List of papers related to neural network quantization in recent AI conferences and journals. ☆618 · Updated last month
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti… ☆65 · Updated last year
- Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding". ☆185 · Updated 3 months ago
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers. ☆189 · Updated 2 years ago
- [NeurIPS '23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models. ☆444 · Updated 9 months ago
- PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight). ☆352 · Updated 3 months ago