vulus98 / Rethinking-attention

My implementation of the original Transformer model (Vaswani et al.). I've also included the playground.py file for visualizing concepts that can otherwise seem hard to grasp. IWSLT pretrained models are currently included.
40 · Updated 11 months ago

Related projects

Alternatives and complementary repositories for Rethinking-attention