The official implementation of the paper "Uncovering the Redundancy in Transformers via a Unified Study of Layer Dropping" (TMLR).
☆189 · Mar 6, 2026 · Updated last month
Alternatives and similar repositories for LLM-Drop
Users interested in LLM-Drop are comparing it to the repositories listed below.
- Source code of the EMNLP 2022 Findings paper "SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters" ☆22 · Feb 28, 2026 · Updated last month
- The open-source Mixture of Depths code and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for E…" ☆31 · Mar 26, 2026 · Updated 2 weeks ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023) ☆45 · Feb 28, 2026 · Updated last month
- The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques" (TMLR) ☆90 · Feb 28, 2026 · Updated last month
- The official implementation of the paper "Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity" ☆16 · Jul 2, 2024 · Updated last year
- [ICLR 2025] Official PyTorch implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆29 · Jul 24, 2025 · Updated 8 months ago
- ☆16 · Jul 23, 2024 · Updated last year
- PyTorch code for FedHyper ☆11 · Aug 28, 2024 · Updated last year
- Layer-Condensed KV cache with 10× larger batch size, fewer parameters, and less computation. Dramatic speedup with better task performance… ☆157 · Apr 7, 2025 · Updated last year
- Code for "ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models" (ICLR 2024) ☆20 · Feb 16, 2024 · Updated 2 years ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆61 · Feb 7, 2025 · Updated last year
- ☆14 · Aug 18, 2022 · Updated 3 years ago
- Codebase for Instruction Following without Instruction Tuning ☆36 · Sep 24, 2024 · Updated last year
- Official PyTorch code for the ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models" ☆23 · Mar 4, 2025 · Updated last year
- ☆21 · Feb 10, 2025 · Updated last year
- Unofficial implementations of block/layer-wise pruning methods for LLMs ☆78 · Apr 29, 2024 · Updated last year
- Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop] ☆91 · Sep 13, 2024 · Updated last year
- ☆131 · Oct 1, 2024 · Updated last year
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆164 · Apr 13, 2025 · Updated 11 months ago
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning" ☆57 · Oct 9, 2025 · Updated 6 months ago
- ☆15 · Apr 11, 2024 · Updated 2 years ago
- [ICML 2025] Predictive Data Selection: The Data That Predicts Is the Data That Teaches ☆63 · Mar 4, 2025 · Updated last year
- Awesome list for LLM pruning ☆288 · Oct 11, 2025 · Updated 6 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆643 · Mar 4, 2024 · Updated 2 years ago
- [WIP🚧] 2025 up-to-date list of resources on visual tokenizers (primarily for visual generation). Give it a star 🌟 if you find it useful… ☆20 · Jan 5, 2025 · Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆39 · Jan 12, 2024 · Updated 2 years ago
- This repository contains data, code, and models for contextual noncompliance ☆25 · Jul 18, 2024 · Updated last year
- [ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The Truth Is Rarely Pure and Never Simple ☆27 · Apr 21, 2025 · Updated 11 months ago
- ☆54 · Nov 3, 2024 · Updated last year
- The official implementation of the paper "SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction" ☆50 · Oct 18, 2024 · Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆33 · Nov 4, 2024 · Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients ☆205 · Jul 17, 2024 · Updated last year
- [TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models ☆126 · Mar 6, 2026 · Updated last month
- [ICML 2024] Official implementation of "SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks" ☆39 · Feb 4, 2025 · Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding" (ACL 2024) ☆363 · Feb 5, 2026 · Updated 2 months ago
- ☆27 · May 13, 2025 · Updated 10 months ago
- ☆15 · Mar 12, 2024 · Updated 2 years ago
- Official repository of the paper "Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models" ☆24 · May 27, 2025 · Updated 10 months ago
- A simple and effective LLM pruning approach ☆860 · Aug 9, 2024 · Updated last year