lucidrains / hamburger-pytorch
PyTorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition?"
☆98 Updated 4 years ago
Alternatives and similar repositories for hamburger-pytorch:
Users who are interested in hamburger-pytorch are comparing it to the repositories listed below:
- Unofficial PyTorch Implementation of EvoNorm ☆121 Updated 3 years ago
- Unofficial PyTorch implementation of "Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Ne… ☆22 Updated 5 years ago
- Implementation of various Vision Transformers I found interesting ☆84 Updated 3 years ago
- [NeurIPS'20] GradAug: A New Regularization Method for Deep Neural Networks ☆93 Updated 4 years ago
- ☆47 Updated 4 years ago
- Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch ☆57 Updated 4 years ago
- ☆49 Updated 5 years ago
- PyTorch implementation of Lambda Network and pretrained Lambda-ResNet ☆54 Updated 4 years ago
- (CVPR 2020) This repo contains code for "PADS: Policy-Adapted Sampling for Visual Similarity Learning", which proposes learnable triplet … ☆60 Updated 4 years ago
- MoEx (Moment Exchange) ☆141 Updated 3 years ago
- Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones ☆198 Updated 4 years ago
- Code for "Are labels necessary for neural architecture search" ☆92 Updated last year
- A ShuffleBatchNorm layer to shuffle BatchNorm statistics across multiple GPUs ☆56 Updated 3 years ago
- The implementation of "Shape Adaptor: A Learnable Resizing Module" [ECCV 2020]. ☆73 Updated 4 years ago
- Full implementation of the paper "Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator". ☆101 Updated 5 years ago
- PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth" ☆62 Updated 8 months ago
- ☆182 Updated 2 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch ☆118 Updated 3 years ago
- ☆62 Updated 4 years ago
- ☆92 Updated 4 years ago
- Pytorch implementation of Learning Rate Dropout. ☆42 Updated 5 years ago
- Sparse Switchable Normalization with sparse activation function SparestMax ☆64 Updated 5 years ago
- Unofficial implementation of Stand-Alone Self-Attention in Vision Models (obsolete) ☆44 Updated 5 years ago
- ContextLab: A Toolbox for Context Feature Augmentation developed with PyTorch ☆39 Updated 5 years ago
- Pytorch implementation of CVPR2021 paper: SuperMix: Supervising the Mixing Data Augmentation ☆92 Updated 3 years ago
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021] ☆88 Updated 3 years ago
- AttentiveNorm_Detection (built on Open MMLab Detection Toolbox and Benchmark) ☆22 Updated 4 years ago
- Code for reproducing experiments in "How Useful is Self-Supervised Pretraining for Visual Tasks?" ☆60 Updated 8 months ago
- [WACV 2022] "Sandwich Batch Normalization: A Drop-In Replacement for Feature Distribution Heterogeneity" by Xinyu Gong, Wuyang Chen, Tian… ☆50 Updated 3 years ago
- [ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845 ☆119 Updated 3 years ago