Rishit-dagli / Fast-Transformer

An implementation of Fastformer: Additive Attention Can Be All You Need, a Transformer Variant in TensorFlow
150 · Updated 2 years ago

Related projects

Alternatives and complementary repositories for Fast-Transformer