kriskrisliu / PAT_Pruning-Aware-Tuning

☆14

Related projects

Alternatives and complementary repositories for PAT_Pruning-Aware-Tuning

CASIA-IVA-Lab / FLAP
[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models
☆37 Updated 10 months ago
raymin0223 / fast_robust_early_exit
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
☆53 Updated last month
machilusZ / FastGen
This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
☆32 Updated 3 months ago
kyegomez / FlashAttention20Triton
Triton implementation of Flash Attention 2.0
☆22 Updated last year
hdong920 / LESS
☆45 Updated 6 months ago
yxli2123 / LoSparse
☆47 Updated last year
kssteven418 / LTP
[KDD'22] Learned Token Pruning for Transformers
☆93 Updated last year
VITA-Group / Random-MoE-as-Dropout
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…
☆44 Updated last year
QingruZhang / PLATON
This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).
☆41 Updated 2 years ago
yuchaoli / PST
Source code for IJCAI 2022 Long paper: Parameter-Efficient Sparsity for Large Language Models Fine-Tuning.
☆13 Updated 2 years ago
hemingkx / SpecDec
Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
☆33 Updated 11 months ago
zyxxmu / DSnoT
Official PyTorch implementation of our ICLR 2024 paper, Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…
☆37 Updated 7 months ago
Lucky-Lance / SPP
[ICML 2024] SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
☆16 Updated 5 months ago
kamanphoebe / Look-into-MoEs
A Closer Look into Mixture-of-Experts in Large Language Models
☆40 Updated 3 months ago
RUCAIBox / QuantizedEmpirical
☆13 Updated last year
song-wx / SIFT
[ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely
☆17 Updated 4 months ago
WowCZ / LongMIT
LongMIT: Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets
☆34 Updated last month
BaiTheBest / SparseLLM
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
☆38 Updated this week
zhangsichengsjtu / AFPQ
AFPQ code implementation
☆18 Updated last year
ArminAzizi98 / LaMDA
☆11 Updated 2 weeks ago
sail-sg / regmix
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
☆88 Updated last month
thunlp / Ouroboros
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)
☆76 Updated last month
bojone / tiger
A Tight-fisted Optimizer
☆47 Updated last year
thu-coai / MiniPLM
☆18 Updated last week
ChaosCodes / ProPETL
One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning
☆38 Updated last year
thunlp / MoEfication
☆108 Updated 4 months ago
pprp / Pruner-Zero
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs
☆74 Updated 5 months ago
YuchuanTian / RethinkTinyLM
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
☆118 Updated 4 months ago
hdong920 / GRIFFIN
☆31 Updated 2 months ago
ModelTC / QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…
☆36 Updated 8 months ago