A block pruning framework for LLMs.
☆28May 17, 2025Updated last year
Alternatives and similar repositories for BlockPruner
Users that are interested in BlockPruner are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆23Nov 26, 2024Updated last year
- Official implementation for LaCo (EMNLP 2024 Findings)☆21Oct 3, 2024Updated last year
- [ICML 2024] Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks☆41Feb 4, 2025Updated last year
- Adversarial Attack for Pre-trained Code Models☆10Jul 19, 2022Updated 3 years ago
- MedConceptsQA: Open source medical concepts QA benchmark☆18Dec 30, 2024Updated last year
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- [ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration☆17Mar 18, 2026Updated 2 months ago
- Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models"☆43May 1, 2025Updated last year
- [ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs☆100Nov 25, 2024Updated last year
- Source code of NeurIPS 2022 accepted paper "AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning"☆23Oct 12, 2022Updated 3 years ago
- [NeurIPS 2024] Search for Efficient LLMs☆17Jan 16, 2025Updated last year
- DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling☆38Jul 12, 2024Updated last year
- Backdooring Neural Code Search☆14Sep 8, 2023Updated 2 years ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models☆118May 24, 2024Updated last year
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models☆267Apr 23, 2024Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- [ACL 2023] TeAST: Temporal Knowledge Graph Embedding via Archimedean Spiral Timeline☆12Mar 4, 2024Updated 2 years ago
- [CVPR 2026] STAMP: Better, Stronger, Faster: Tackling the Trilemma in MLLM-based Segmentation with Simultaneous Textual Mask Prediction☆37Feb 21, 2026Updated 2 months ago
- A simple and effective LLM pruning approach.☆863Aug 9, 2024Updated last year
- [NAACL 2025 Main Selected Oral] Repository for the paper: Prompt Compression for Large Language Models: A Survey☆37May 18, 2025Updated last year
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)☆68Mar 27, 2025Updated last year
- NAACL2021-Temporal Knowledge Graph Completion using a Linear TemporalRegularizer and Multivector Embeddings☆20Jan 27, 2022Updated 4 years ago
- Source code for the COLING 2022 paper "KGE-CL: Contrastive Learning of Tensor Decomposition Based Knowledge Graph Embeddings".☆24Oct 29, 2022Updated 3 years ago
- Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic☆32Feb 18, 2026Updated 3 months ago
- An unofficial re-implementation of Graph Structure of Neural Networks (Jiaxuan You · Kaiming He · Jure Leskovec · Saining Xie) ICML 2020☆10Jul 27, 2020Updated 5 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- ☆30Jul 22, 2024Updated last year
- [CVPR '24] Official implementation of the paper "Multiflow: Shifting Towards Task-Agnostic Vision-Language Pruning".☆23Mar 7, 2025Updated last year
- [NeurIPS 2024] AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models☆34Jun 9, 2025Updated 11 months ago
- ☆63Dec 15, 2024Updated last year
- A fork of the PEFT library, supporting Robust Adaptation (RoSA)☆15Aug 16, 2024Updated last year
- ☆14Feb 26, 2025Updated last year
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search☆17Jan 24, 2026Updated 3 months ago
- 🤗An unofficial PyTorch implementation of ConvBert based on huggingface/transformers.☆17Oct 6, 2022Updated 3 years ago
- Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"☆81Jul 7, 2025Updated 10 months ago
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Pytorch code of [CVPR 2023] "NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction".☆11Mar 14, 2023Updated 3 years ago
- Multi-dimensional analysis of orthogonal safety directions in LLM alignment☆22Mar 20, 2025Updated last year
- Official implementation of "ConViS-Bench: Estimating Video Similarity Through Semantic Concepts", NeurIPS 2025☆26Nov 28, 2025Updated 5 months ago
- ☆11Jan 21, 2021Updated 5 years ago
- This is the official repo for "Differentiable Model Scaling using Differentiable Topk"☆12May 16, 2024Updated 2 years ago
- DatasetResearch: Benchmarking Agent Systems for Demand-Driven Dataset Discovery☆20Sep 24, 2025Updated 7 months ago
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)☆45Feb 13, 2024Updated 2 years ago