shoaibahmed/llm_depth_pruning

Official implementation of the paper: "A deeper look at depth pruning of LLMs"

☆15

Alternatives and similar repositories for llm_depth_pruning

Users that are interested in llm_depth_pruning are comparing it to the libraries listed below.

XMUDeepLIT / QGC
Code for "Retaining Key Information under High Compression Rates: Query-Guided Compressor for LLMs" (ACL 2024)
☆20 · Jun 12, 2024 · Updated 2 years ago
DRSY / KV_Compression
[EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens
☆25 · Nov 6, 2023 · Updated 2 years ago
csyhhu / Co-Prune
Code for the accepted paper "Cooperative Pruning in Cross-Domain Deep Neural Network Compression" (IJCAI 2019).
☆12 · Aug 15, 2019 · Updated 6 years ago
google-research / trc
☆13 · Jan 27, 2023 · Updated 3 years ago
FranxYao / Retrieval-Head-with-Flash-Attention
Efficient retrieval head analysis with triton flash attention that supports topK probability
☆13 · Jun 15, 2024 · Updated 2 years ago
haolibai / APS-channel-search
Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020
☆21 · Nov 15, 2020 · Updated 5 years ago
eth-lre / LLM_ICL
ACL24
☆11 · Jun 7, 2024 · Updated 2 years ago
houlu369 / Loss-aware-Binarization
Implementation of ICLR 2017 paper "Loss-aware Binarization of Deep Networks"
☆20 · Feb 24, 2019 · Updated 7 years ago
Infini-AI-Lab / S2FT
☆19 · Jan 3, 2025 · Updated last year
abuyukcakir / adversarial-training-survey
Papers, sites and slides for Adversarial Training
☆17 · Jun 30, 2020 · Updated 6 years ago
nota-github / ERGO
ERGO (Efficient Reasoning & Guided Observation) is a large vision-language model trained with reinforcement learning on efficiency object…
☆19 · Feb 25, 2026 · Updated 5 months ago
machilusZ / FastGen
This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
☆44 · Aug 14, 2024 · Updated last year
BayesWatch / tf-variational-dropout
Sparsifying Variational Dropout in Tensorflow
☆22 · Nov 3, 2017 · Updated 8 years ago
hobinkwak / ExpectedGradients_IntegratedGradients_pytorch
Simple implementation of Expected Gradients and Integrated Gradients in PyTorch
☆12 · May 11, 2022 · Updated 4 years ago
vdlad / Remarkable-Robustness-of-LLMs
Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"
☆20 · Jun 11, 2025 · Updated last year
HarlynDN / WebCiteS
[ACL'24] WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
☆13 · Sep 11, 2024 · Updated last year
Elvin-Yiming-Du / Memory-T1
This repository is used for the time reasoning task in multi-session dialogue systems.
☆16 · Feb 7, 2026 · Updated 5 months ago
Nota-NetsPresso / shortened-llm
Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]
☆90 · Sep 13, 2024 · Updated last year
Winsleo / General-parallel-genetic-algorithm-on-GPU
CUDA-based, GPU-accelerated implementation of a general-purpose genetic algorithm; the experimental platform is the Nvidia Jetson Nano
☆13 · Mar 23, 2023 · Updated 3 years ago
ttwthomas / nanogpt
Fork of karpathy's nanoGPT with custom datasets
☆11 · Jul 25, 2023 · Updated 2 years ago
tajanthan / pmf
Proximal Mean-field for Neural Network Quantization
☆21 · Apr 9, 2020 · Updated 6 years ago
NVIDIA-AI-IOT / mmj_utils
A utility library to help integrate Python applications with Metropolis Microservices for Jetson
☆16 · Dec 21, 2024 · Updated last year
ruikangliu / IntactKV
[ACL 2024] Official PyTorch implementation of "IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact"
☆46 · May 24, 2024 · Updated 2 years ago
hyperrixel / infinitybatch
PyTorch tool for training with bigger batch size on the GPU
☆11 · Feb 26, 2021 · Updated 5 years ago
WindyLee0822 / CTG
Source code of “Reinforcement Learning with Token-level Feedback for Controllable Text Generation” (NAACL 2024)
☆17 · Dec 8, 2024 · Updated last year
Luckfort / CD
[COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
☆82 · Jan 22, 2025 · Updated last year
lanl / pyDNMFk
Python Distributed Non Negative Matrix Factorization with custom clustering
☆25 · Aug 22, 2023 · Updated 2 years ago
AdelWang / KD-CoT
☆15 · Apr 22, 2024 · Updated 2 years ago
WangWenhao0716 / PDF-Embedding
[NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"
☆18 · Oct 1, 2024 · Updated last year
RuishanLiu / GAN-TSC
☆11 · Oct 15, 2020 · Updated 5 years ago
GATECH-EIC / SuperTickets
[ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning
☆20 · Jul 7, 2022 · Updated 4 years ago
mlwu22 / RED
Implementation code for ACL 2024: Advancing Parameter Efficiency in Fine-tuning via Representation Editing
☆15 · Apr 20, 2024 · Updated 2 years ago
hemingkx / SWIFT
[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
☆70 · Feb 21, 2025 · Updated last year
dilab-zju / self-speculative-decoding
Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**
☆230 · Feb 13, 2025 · Updated last year
VITA-Group / EarlyBERT
[ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, …
☆18 · Dec 30, 2021 · Updated 4 years ago
ay27 / RandomGit
Randomly scrapes words and phrases from classical Chinese poetry and prose to use as git commit messages
☆11 · Jan 16, 2017 · Updated 9 years ago
hrtan / MoSo
[NeurIPS-2023] The PyTorch Implementation of MoSo. The algorithms are based on our paper: "Data Pruning via Moving-one-Sample-out". MoSo …
☆10 · May 21, 2026 · Updated 2 months ago
djcrw / Supervised-Predictive-Coding
Implementation for "An Approximation of the Error Backpropagation Algorithm in a Predictive Coding Network with Local Hebbian Synaptic Pl…
☆16 · Oct 10, 2018 · Updated 7 years ago
zb2313 / DB-frontend
Tongji University class of 2019 database course design project
☆11 · Sep 11, 2021 · Updated 4 years ago