yangyifei729/LaCo

yangyifei729 / LaCo

Official implementation for LaCo (EMNLP 2024 Findings)

☆22

Alternatives and similar repositories for LaCo

Users who are interested in LaCo are comparing it to the repositories listed below.

jiwonsong-dev / SLEB
[ICML 2024] Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
☆42 · Feb 4, 2025 · Updated last year
luuyin / OWL
Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
☆82 · Jul 7, 2025 · Updated last year
SempraETY / Pruning-via-Merging
☆23 · Nov 26, 2024 · Updated last year
goodevening13 / aquakv
☆21 · Apr 27, 2026 · Updated 3 months ago
Beryex / RLPruner-CNN
RL-Pruner: Structured Pruning Using Reinforcement Learning for CNN Compression and Acceleration
☆29 · Jun 8, 2025 · Updated last year
aim-uofa / LoRAPrune
☆63 · Dec 15, 2024 · Updated last year
haiquanlu / AlphaPruning
[NeurIPS 2024] AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
☆34 · Jun 9, 2025 · Updated last year
imagination-research / EEP
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
☆25 · Nov 11, 2025 · Updated 8 months ago
RUCKBReasoning / LLM-Streamline
Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models"
☆43 · May 1, 2025 · Updated last year
CASIA-LMC-Lab / FLAP
[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models
☆76 · Jan 6, 2024 · Updated 2 years ago
sdc17 / UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
☆103 · Dec 30, 2024 · Updated last year
wazenmai / HC-SMoE
[ICML 2025] Retraining-Free Merging of Sparse MoE via Hierarchical Clustering
☆25 · Oct 26, 2025 · Updated 9 months ago
BaiTheBest / SparseLLM
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
☆70 · Mar 27, 2025 · Updated last year
keeeeenw / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆14 · Mar 30, 2024 · Updated 2 years ago
melisa-writer / short-transformers
Prune transformer layers
☆74 · May 30, 2024 · Updated 2 years ago
FarinaMatteo / multiflow
[CVPR '24] Official implementation of the paper "Multiflow: Shifting Towards Task-Agnostic Vision-Language Pruning".
☆24 · Mar 7, 2025 · Updated last year
locuslab / wanda
A simple and effective LLM pruning approach.
☆868 · Aug 9, 2024 · Updated last year
tanvir-utexas / PaPr
☆13 · Jul 3, 2024 · Updated 2 years ago
FarinaMatteo / qmmf
[CVPR '23 Highlight] Official repository for the paper "Quantum Multi-Model Fitting".
☆11 · Mar 7, 2025 · Updated last year
fscdc / Awesome-Efficient-Reasoning-Models
[TMLR 2025] Efficient Reasoning Models: A Survey
☆317 · Jun 26, 2026 · Updated last month
francescotonini / al-gtd
Official repo of the paper "AL-GTD: Deep Active Learning for Gaze Target Detection" (ACMMM2024)
☆12 · Updated this week
CASE-Lab-UMD / Unified-MoE-Compression
The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)".
☆89 · Feb 28, 2026 · Updated 5 months ago
krafton-ai / lexico
KV cache compression via sparse coding
☆17 · Oct 26, 2025 · Updated 9 months ago
innvariant / deepstruct
Sparse Neural Network Tools
☆12 · Jul 15, 2024 · Updated 2 years ago
RobertTLange / es-lottery
Lottery Tickets in Evolutionary Optimization (Lange & Sprekeler, ICML 2023)
☆17 · Jun 2, 2023 · Updated 3 years ago
csguoh / OBR
[ICLR2026] The first W4A4KV4 quantized + 50% sparse LLMs!
☆33 · Jan 26, 2026 · Updated 6 months ago
Shiweiliuiiiiiii / In-Time-Over-Parameterization
[ICML 2021] "Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training" by Shiwei Liu, Lu Yin, De…
☆46 · Nov 11, 2023 · Updated 2 years ago
ZhangAIPI / YOPO_MLLM_Pruning
Pruning the VLLMs
☆106 · Dec 9, 2024 · Updated last year
yolky / RCIG
☆15 · Apr 25, 2023 · Updated 3 years ago
VITA-Group / ramanujan-on-pai
[ICLR 2023] "Revisiting Pruning At Initialization Through The Lens of Ramanujan Graph" by Duc Hoang, Shiwei Liu, Radu Marculescu, Atlas W…
☆14 · Aug 4, 2023 · Updated 2 years ago
ksreenivasan / pruning_is_enough
Pruning is all you need (hopefully)
☆12 · Sep 7, 2022 · Updated 3 years ago
RoySegal / tvmcon23_byoc
☆11 · Mar 15, 2023 · Updated 3 years ago
declare-lab / della
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
☆37 · Jul 12, 2024 · Updated 2 years ago
horseee / LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich…
☆1,133 · Oct 7, 2024 · Updated last year
shawnricecake / search-llm
[NeurIPS 2024] Search for Efficient LLMs
☆16 · Jan 16, 2025 · Updated last year
yxli2123 / LoSparse
☆64 · Oct 17, 2023 · Updated 2 years ago
Nota-NetsPresso / SNP
Structured Neuron Level Pruning to compress Transformer-based models [ECCV'24]
☆16 · Aug 7, 2024 · Updated last year
ZIB-IOL / SMS
Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"
☆12 · Oct 14, 2025 · Updated 9 months ago
zyxxmu / DSnoT
Official Pytorch Implementation of Our Paper Accepted at ICLR 2024: Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…
☆51 · Apr 9, 2024 · Updated 2 years ago