lucidrains / Mega-pytorch

Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
203 · Updated last year
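
To give a rough sense of the "multi-headed EMA" half of the description, here is a minimal sketch of several exponential moving averages running in parallel over one sequence, each with its own smoothing factor. It is an illustration only: the function name `multi_head_ema` and its parameters are hypothetical, it is not this repository's API, and it leaves out the attention component and any learned parameters a real model would use.

```python
def multi_head_ema(xs, alphas):
    """Run one exponential moving average per head over a 1-D sequence.

    xs: list of floats (the input sequence).
    alphas: smoothing factors in (0, 1], one per (hypothetical) head.
    Returns a list with one smoothed sequence per head.
    """
    outputs = []
    for alpha in alphas:
        state = 0.0  # each head starts from a zero state
        head_out = []
        for x in xs:
            # Classic EMA recurrence: y_t = alpha * x_t + (1 - alpha) * y_{t-1}
            state = alpha * x + (1.0 - alpha) * state
            head_out.append(state)
        outputs.append(head_out)
    return outputs


# An impulse input shows how each head decays at its own rate.
smoothed = multi_head_ema([1.0, 0.0, 0.0, 0.0], alphas=[0.5, 1.0])
```

A small `alpha` gives a head a long memory of past inputs, while `alpha = 1.0` passes the input through unchanged; using several heads lets a model mix short and long time scales.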

Related projects

Alternatives and complementary repositories for Mega-pytorch