Rishit-dagli / Fast-Transformer

A TensorFlow implementation of "Fastformer: Additive Attention Can Be All You Need", a Transformer variant
149 · Updated 2 years ago

Related projects: