bdusell / stack-attention

Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
16 · Updated 7 months ago

Related projects

Alternatives and complementary repositories for stack-attention