OpenGVLab / LLMPrune-BESALinks

BESA is a differentiable weight pruning technique for large language models.

☆17

Alternatives and similar repositories for LLMPrune-BESA

Users that are interested in LLMPrune-BESA are comparing it to the libraries listed below

Sorting:

MingSun-Tse / Why-the-State-of-Pruning-so-Confusing
[Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin…
☆41Updated 2 months ago
zyxxmu / DSnoT
Official Pytorch Implementation of Our Paper Accepted at ICLR 2024-- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…
☆50Updated last year
ziplab / QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…
☆30Updated last year
shawnricecake / search-llm
[NeurIPS 2024] Search for Efficient LLMs
☆15Updated 10 months ago
VILA-Lab / GBLM-Pruner
Are gradient information useful for pruning of LLMs?
☆47Updated 3 months ago
htqin / IR-QLoRA
[ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…
☆67Updated last year
imagination-research / LCSC
[ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
☆16Updated 9 months ago
zju-vipa / training_free_model_merging
This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).
☆32Updated last year
yxli2123 / LoSparse
☆62Updated 2 years ago
ylsung / ECoFLaP
Code for "ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models" (ICLR 2024)
☆20Updated last year
ModelTC / QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…
☆39Updated last year
VITA-Group / BackRazor_Neurips22
[Neurips 2022] “ Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropogation”, Ziyu Jiang*, Xuxi Chen*, Xueqin Huan…
☆20Updated 2 years ago
shoaibahmed / llm_depth_pruning
Official implementation of the paper: "A deeper look at depth pruning of LLMs"
☆15Updated last year
wzhuang-xmu / LoSA
[ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models".
☆23Updated 8 months ago
pprp / Pruner-Zero
[ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
☆96Updated last year
GATECH-EIC / Linearized-LLM
[ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
☆35Updated last year
mrflogs / LoRA-Pro
Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized? "
☆137Updated 8 months ago
Lucky-Lance / SPP
[ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
☆21Updated last year
TianjinYellow / StableSPAM
☆27Updated 8 months ago
MingSun-Tse / TPP
[ICLR'23] Trainability Preserving Neural Pruning (PyTorch)
☆34Updated 2 years ago
VITA-Group / Junk_DNA_Hypothesis
[ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…
☆16Updated 7 months ago
HuangOwen / QAT-ACS
[TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection"
☆35Updated last year
ilur98 / DGQ
Official Code For Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM
☆14Updated last year
cliang1453 / task-aware-distillation
Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023)
☆40Updated 2 years ago
ROIM1998 / APT
[ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
☆46Updated last year
ggjy / vision_weak_to_strong
☆38Updated last year
leo-yangli / VB-LoRA
This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024).
☆42Updated last year
HuangOwen / RoLoRA
[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
☆38Updated last year
pixeli99 / MixLN
[ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…
☆27Updated 4 months ago
ModelTC / L2_Compression
☆13Updated last year