Code for the paper "Are Sixteen Heads Really Better than One?"
☆175 · Apr 1, 2020 · Updated 6 years ago
Alternatives and similar repositories for are-16-heads-really-better-than-1
Users who are interested in are-16-heads-really-better-than-1 are comparing it to the libraries listed below.
- This is a repository with the code for the ACL 2019 paper "Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, t… ☆319 · Aug 2, 2021 · Updated 4 years ago
- Transformer training code for sequential tasks ☆609 · Sep 14, 2021 · Updated 4 years ago
- ☆472 · Apr 4, 2021 · Updated 5 years ago
- ⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).