aiha-lab / Attention-Head-Pruning

Layer-wise Pruning of Transformer Heads for Efficient Language Modeling
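The repository accompanies work on layer-wise pruning of attention heads. As a minimal sketch only (the class name, gating mechanism, and shapes below are illustrative assumptions, not the repository's actual API), the idea can be shown as multi-head self-attention with a per-head gate, where setting a gate to zero removes that head's contribution:

```python
# Hypothetical sketch (not the repository's code): multi-head self-attention
# with a per-head binary gate, illustrating how a pruned head contributes
# nothing to the layer output.
import torch
import torch.nn as nn


class GatedMultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # One gate per head; setting an entry to 0 prunes that head.
        self.register_buffer("head_gate", torch.ones(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split(z):
            # (batch, time, d_model) -> (batch, heads, time, d_head)
            return z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        ctx = attn @ v                                 # (b, heads, t, d_head)
        ctx = ctx * self.head_gate.view(1, -1, 1, 1)   # zero out pruned heads
        ctx = ctx.transpose(1, 2).reshape(b, t, -1)
        return self.out(ctx)


if __name__ == "__main__":
    layer = GatedMultiHeadAttention(d_model=64, n_heads=8)
    layer.head_gate[3] = 0.0                           # prune the 4th head
    y = layer(torch.randn(2, 10, 64))
    print(y.shape)                                     # torch.Size([2, 10, 64])
```

In practice, a pruned head's query/key/value and output projection columns can also be physically removed to realize the compute savings; the mask above only demonstrates the functional effect.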

Related projects

Alternatives and complementary repositories for Attention-Head-Pruning