datnnt1997 / multi-head_self-attention
A Faster Pytorch Implementation of Multi-Head Self-Attention
☆74, updated 3 years ago
Alternatives and similar repositories for multi-head_self-attention
Users interested in multi-head_self-attention are comparing it to the libraries listed below.
- Multi-head attention in PyTorch ☆152, updated 6 years ago
- This is a repository for Multi-task learning with toy data in Pytorch and Tensorflow ☆136, updated 6 years ago
- My implementation of the gMLP model from the paper "Pay Attention to MLPs". ☆25, updated 4 years ago
- my codes for learning attention mechanism ☆50, updated 4 years ago
- Implementation of CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification ☆200, updated 4 years ago
- Experiments with supervised contrastive learning methods with different loss functions ☆220, updated 2 years ago
- PyTorch implementation of some attentions for Deep Learning Researchers. ☆532, updated 3 years ago
- Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms ☆260, updated 4 years ago
- Independent implementation of Supervised Contrastive Loss. Straight to the point and beyond ☆81, updated 4 years ago
- Pytorch implementation of Masked Auto-Encoder ☆40, updated 3 years ago
- Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, Auxiliary Tasks in Multi-task Learning ☆619, updated 4 years ago
- PyTorch implementation of some learning rate schedulers for deep learning researcher. ☆90, updated 2 years ago
- PyTorch Implementation of the Multi-gate Mixture-of-Experts with Exclusivity (MMoEEx) ☆32, updated 3 years ago
- Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch ☆426, updated 3 years ago
- Squeeze and Excitation network implementation. ☆18, updated 6 years ago
- PyTorch implementation of the models described in the IEEE ICASSP 2022 paper "Is cross-attention preferable to self-attention for multi-m… ☆59, updated 2 months ago
- Custom loss functions to use in (mainly) PyTorch. ☆39, updated 4 years ago
- A PyTorch Tutorials of Sentiment Analysis Classification (RNN, LSTM, Bi-LSTM, LSTM+Attention, CNN) ☆313, updated 2 years ago
- Recent Advances in MLP-based Models (MLP is all you need!) ☆115, updated 2 years ago
- Gluon implementation of channel-attention modules: SE, ECA, GCT ☆40, updated 4 years ago
- Experimenting with different regression losses. Implemented in Pytorch. ☆147, updated 6 years ago
- Implement the Pareto Optimizer and pcgrad to make a self-adaptive loss for multi-task ☆47, updated 3 years ago
- PyTorch implementation of the GradNorm ☆95, updated 9 months ago
- ☆146, updated 3 years ago
- MSc group project: Reproduction of 'Multi-Task Learning using Uncertainty to Weigh Losses for Scene Geometry and Semantics'; A. Kendall, … ☆89, updated 5 years ago
- Sequencer: Deep LSTM for Image Classification ☆143, updated 2 years ago
- Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision ☆217, updated 4 years ago
- Pytorch implementation of the GradNorm. GradNorm addresses the problem of balancing multiple losses for multi-task learning by learning a… ☆269, updated 2 years ago
- My take on a practical implementation of Linformer for Pytorch. ☆414, updated 2 years ago
- Implementation of Transformer encoder in PyTorch ☆66, updated 4 years ago