Qualcomm-AI-research / llm-surgeonView external linksLinks
☆35May 24, 2024Updated last year
Alternatives and similar repositories for llm-surgeon
Users that are interested in llm-surgeon are comparing it to the libraries listed below
Sorting:
- ☆30Jul 22, 2024Updated last year
- Official Pytorch Implementation of Our Paper Accepted at ICLR 2024-- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…☆50Apr 9, 2024Updated last year
- [ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference☆46Jun 4, 2024Updated last year
- [ICML 2024] Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks☆39Feb 4, 2025Updated last year
- Are gradient information useful for pruning of LLMs?☆47Aug 23, 2025Updated 5 months ago
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications☆52Oct 30, 2025Updated 3 months ago
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better☆16Feb 15, 2025Updated last year
- The handbook for leading Applied AI teams☆13Jan 23, 2026Updated 3 weeks ago
- Source-to-Source Debuggable Derivatives in Pure Python☆15Jan 23, 2024Updated 2 years ago
- An implementation of the DISP-LLM method from the NeurIPS 2024 paper: Dimension-Independent Structural Pruning for Large Language Models.☆23Aug 6, 2025Updated 6 months ago
- LLM training in simple, raw C/CUDA☆15Dec 5, 2024Updated last year
- Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)☆67Mar 27, 2025Updated 10 months ago
- ☆43Oct 31, 2024Updated last year
- [ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models".☆23Mar 16, 2025Updated 10 months ago
- ☆23Nov 26, 2024Updated last year
- This repository contains the code for our ICML 2025 paper——LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection🎉☆25May 29, 2025Updated 8 months ago
- This repo contains the code for studying the interplay between quantization and sparsity methods☆26Feb 26, 2025Updated 11 months ago
- [ICLR'25] ARB-LLM: Alternating Refined Binarizations for Large Language Models☆28Aug 5, 2025Updated 6 months ago
- ☆56Jun 10, 2024Updated last year
- AFPQ code implementation☆23Nov 6, 2023Updated 2 years ago
- ☆23Apr 18, 2022Updated 3 years ago
- [NeurIPS 2024] AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models☆31Jun 9, 2025Updated 8 months ago
- ☆63Oct 17, 2023Updated 2 years ago
- Implementation of KDR-Agent, the AAAI 2025 accepted paper, focusing on knowledge-driven reasoning for autonomous agents.☆16Nov 24, 2025Updated 2 months ago
- [ICLR 2025] Official PyTorch Implementation for CPE: Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Ga…☆12Apr 7, 2025Updated 10 months ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).☆32Mar 5, 2024Updated last year
- [ICRA 2024] WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection☆12Feb 6, 2024Updated 2 years ago
- ☆40Nov 22, 2025Updated 2 months ago