MrGGLS / BlockPruner
A block pruning framework for LLMs.
☆22 Updated 10 months ago
Alternatives and similar repositories for BlockPruner
Users who are interested in BlockPruner are comparing it to the repositories listed below.
- ☆18 Updated 5 months ago
- Official implementation for LaCo (EMNLP 2024 Findings) ☆16 Updated 7 months ago
- Official Pytorch Implementation of Our Paper Accepted at ICLR 2024-- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM… ☆47 Updated last year
- ☆15 Updated 6 months ago
- [ICLR 2025] The official PyTorch implementation of "Dynamic Low-Rank Sparse Adaptation for Large Language Models". ☆18 Updated 2 months ago
- Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models" ☆26 Updated 2 weeks ago
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆47 Updated 2 months ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆141 Updated 2 months ago
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆45 Updated 7 months ago
- Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks ☆37 Updated 3 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆67 Updated 3 months ago
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) ☆58 Updated last month
- [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference ☆39 Updated 11 months ago