aiha-lab / Attention-Head-Pruning

Layer-wise Pruning of Transformer Heads for Efficient Language Modeling
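The technique named in the title — pruning attention heads layer by layer — typically amounts to masking out selected heads' outputs before the output projection. A minimal NumPy sketch of that masking step (all names here are illustrative, not taken from the repository's code):

```python
import numpy as np

def prune_heads(head_outputs, keep_mask):
    """Zero out pruned heads' outputs before the output projection.

    head_outputs: array of shape (num_heads, seq_len, head_dim)
    keep_mask:    length-num_heads sequence of booleans, True = keep
    """
    mask = np.asarray(keep_mask, dtype=head_outputs.dtype)
    # Broadcast the per-head mask over the sequence and feature axes.
    return head_outputs * mask[:, None, None]

# Toy example: 4 heads, prune heads 1 and 3.
rng = np.random.default_rng(0)
outs = rng.standard_normal((4, 5, 8))
pruned = prune_heads(outs, [True, False, True, False])
```

After masking, the pruned heads contribute nothing downstream, so their parameters can be removed entirely for a real speedup.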

Related projects: