The official implementation of the paper "Uncovering the Redundancy in Transformers via a Unified Study of Layer Dropping (TMLR)".
☆191Apr 23, 2026Updated last month
Alternatives and similar repositories for LLM-Drop
Users that are interested in LLM-Drop are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Source code of ACL 2023 Main Conference Paper "PAD-Net: An Efficient Framework for Dynamic Networks".☆14Feb 28, 2026Updated 3 months ago
- Source code of EMNLP 2022 Findings paper "SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters"☆23Feb 28, 2026Updated 3 months ago
- The open-source Mixture of Depths code and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for E…☆31May 12, 2026Updated last month
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)":☆46Feb 28, 2026Updated 3 months ago
- The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)".☆90Feb 28, 2026Updated 3 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- The official implementation of the paper "Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity".☆17Jul 2, 2024Updated last year
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…☆30Jul 24, 2025Updated 10 months ago
- ☆16Jul 23, 2024Updated last year
- Pytorch Code for FedHyper☆11Aug 28, 2024Updated last year
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…☆156Apr 7, 2025Updated last year
- Code for "ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models" (ICLR 2024)☆21Feb 16, 2024Updated 2 years ago
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models☆61Feb 7, 2025Updated last year
- ☆14Aug 18, 2022Updated 3 years ago
- Codebase for Instruction Following without Instruction Tuning☆36Sep 24, 2024Updated last year
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models"☆23Mar 4, 2025Updated last year
- ☆21Feb 10, 2025Updated last year
- Unofficial implementations of block/layer-wise pruning methods for LLMs.☆78Apr 29, 2024Updated 2 years ago
- Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]☆90Sep 13, 2024Updated last year
- ☆131Oct 1, 2024Updated last year
- ☆42Oct 31, 2024Updated last year
- (ACL2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation☆36May 28, 2025Updated last year
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆166Apr 13, 2025Updated last year
- ☆15Apr 11, 2024Updated 2 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- [ICML 2025] Predictive Data Selection: The Data That Predicts Is the Data That Teaches☆64Mar 4, 2025Updated last year
- Awesome list for LLM pruning.☆298Oct 11, 2025Updated 8 months ago
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"☆63Oct 9, 2025Updated 8 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning☆643Mar 4, 2024Updated 2 years ago
- [WIP🚧] 2025 up-to-date list of resources on visual tokenizers (primarily for visual generation). Give it a star 🌟 if you find it useful…☆20Jan 5, 2025Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Jan 12, 2024Updated 2 years ago
- ☆55Nov 3, 2024Updated last year
- This repository contains data, code and models for contextual noncompliance.☆26Jul 18, 2024Updated last year
- [ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing llms: The truth is rarely pure and never simple.☆27Apr 21, 2025Updated last year
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.☆51Oct 18, 2024Updated last year
- [ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations☆19Oct 18, 2025Updated 7 months ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"☆33Nov 4, 2024Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆205Jul 17, 2024Updated last year
- [TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models☆125Mar 6, 2026Updated 3 months ago
- [ICML 2024] Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks☆41Feb 4, 2025Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024☆371Apr 13, 2026Updated last month