Skumarr53 / Attention-is-All-you-Need-PyTorch
This repo contains a PyTorch implementation of the "Attention Is All You Need" (Transformer) paper for machine translation of French queries into English.
☆65 · Updated 4 years ago
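The heart of the paper is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V. Below is a minimal PyTorch sketch of that operation; the function name, tensor shapes, and example values are illustrative assumptions, not code taken from this repo:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.

    q, k, v: (batch, heads, seq_len, d_k); shapes assumed for illustration.
    """
    d_k = q.size(-1)
    # Similarity scores between queries and keys, scaled to keep softmax stable.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Block attention to masked positions (e.g. padding, or future tokens in the decoder).
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Example: 2 sentences, 8 heads, 10 tokens, 64-dim heads.
q = k = v = torch.randn(2, 8, 10, 64)
out = scaled_dot_product_attention(q, k, v)  # (2, 8, 10, 64)
```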
Related projects
Alternatives and complementary repositories for Attention-is-All-you-Need-PyTorch
- Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using PyTorch ☆70 · Updated 4 years ago
- Implementation of Feedback Transformer in PyTorch ☆104 · Updated 3 years ago
- An educational step-by-step implementation of SimCLR that accompanies the blog post ☆32 · Updated 2 years ago
- ☆76 · Updated 4 years ago
- A simple-to-use PyTorch wrapper for contrastive self-supervised learning on any neural network ☆123 · Updated 3 years ago
- [ICML 2020] Code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845 ☆119 · Updated 3 years ago
- A PyTorch implementation of the paper "Synthesizer: Rethinking Self-Attention in Transformer Models" ☆71 · Updated last year
- Official PyTorch implementation of the paper "Self-Supervised Relational Reasoning for Representation Learning", NeurIPS 2020 Spotlight ☆143 · Updated 7 months ago
- A PyTorch implementation of the Transformer in "Attention Is All You Need" ☆103 · Updated 3 years ago
- Code for "Multi-Head Attention: Collaborate Instead of Concatenate" ☆150 · Updated last year
- A simple PyTorch-TensorBoard tutorial and example for beginners ☆23 · Updated 4 years ago
- Graph neural network message passing reframed as a Transformer with local attention ☆66 · Updated last year
- MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space ☆40 · Updated 3 years ago
- Implementation of Fast Transformer in PyTorch ☆171 · Updated 3 years ago
- Code for "Finetuning Pretrained Transformers into Variational Autoencoders" ☆37 · Updated 2 years ago
- Code for the DDP tutorial ☆32 · Updated 2 years ago
- Axial Positional Embedding for PyTorch ☆60 · Updated 3 years ago
- ☆47 · Updated 3 years ago
- A guide that integrates PyTorch DistributedDataParallel, Apex, warmup, and a learning-rate scheduler, and also mentions the set-up of early-stoppin… ☆60 · Updated 2 years ago
- Unofficial PyTorch implementation of Fastformer, based on the paper "Fastformer: Additive Attention Can Be All You Need" ☆134 · Updated 3 years ago
- An implementation of masked language modeling for PyTorch, made as concise and simple as possible ☆177 · Updated last year
- Implementation of OmniNet, Omnidirectional Representations from Transformers, in PyTorch ☆55 · Updated 3 years ago
- Implementation of Memformer, a memory-augmented Transformer, in PyTorch ☆106 · Updated 4 years ago
- Fully featured implementation of the Routing Transformer ☆284 · Updated 3 years ago
- Reproducing the Linear Multihead Attention introduced in the Linformer paper ("Linformer: Self-Attention with Linear Complexity") ☆73 · Updated 4 years ago
- PyTorch implementation of Compressive Transformers, from DeepMind ☆155 · Updated 3 years ago
- PyTorch implementation of "Pay Attention to MLPs" ☆39 · Updated 3 years ago
- RNN Encoder-Decoder in PyTorch ☆42 · Updated 3 months ago
- A non-JIT implementation/replication of OpenAI's CLIP in PyTorch ☆34 · Updated 3 years ago
- [ICML 2020] Code for the flooding regularizer proposed in "Do We Need Zero Training Loss After Achieving Zero Training Error?" ☆92 · Updated last year