lucidrains / hamburger-pytorch
Pytorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition?"
☆99 · Jan 13, 2021 · Updated 5 years ago
Alternatives and similar repositories for hamburger-pytorch
Users who are interested in hamburger-pytorch are comparing it to the libraries listed below.
- ☆22 · May 3, 2022 · Updated 3 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Apr 6, 2022 · Updated 3 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever" ☆32 · Dec 16, 2020 · Updated 5 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 · Jan 27, 2022 · Updated 4 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆49 · Nov 30, 2021 · Updated 4 years ago
- Implementation for ACProp (Momentum centering and asynchronous update for adaptive gradient methods, NeurIPS 2021) ☆16 · Oct 11, 2021 · Updated 4 years ago
- ☆12 · Sep 26, 2019 · Updated 6 years ago
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper ☆67 · Jan 10, 2023 · Updated 3 years ago
- ☆182 · Feb 23, 2023 · Updated 2 years ago
- ICLR 2021 (spotlight): Graph Convolution with Low-rank Learnable Local Filters ☆16 · Jan 14, 2021 · Updated 5 years ago
- Code for the ICML 2021 paper "Sharing Less is More: Lifelong Learning in Deep Networks with Selective Layer Transfer" ☆12 · Aug 17, 2021 · Updated 4 years ago
- Implementation of the Remixer Block from the Remixer paper, in Pytorch ☆36 · Sep 27, 2021 · Updated 4 years ago
- Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch ☆59 · Mar 19, 2021 · Updated 4 years ago
- BoundarySqueeze: Image Segmentation as Boundary Squeezing ☆56 · Apr 9, 2022 · Updated 3 years ago
- [NeurIPS 2021] SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning ☆63 · Mar 10, 2023 · Updated 2 years ago
- Neural Arithmetic Logic Units by Trask et al. ☆12 · Apr 10, 2019 · Updated 6 years ago
- huggingface ChineseBert Tokenizer ☆16 · Apr 16, 2022 · Updated 3 years ago
- Codebase for the paper "Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning" ☆17 · Jul 12, 2021 · Updated 4 years ago
- ☆31 · Dec 20, 2022 · Updated 3 years ago
- [ECCV'20 Oral] MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution ☆159 · Oct 4, 2022 · Updated 3 years ago
- Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation (CVPR 2020) ☆202 · Sep 25, 2020 · Updated 5 years ago
- [NeurIPS'20] Learning Semantic-aware Normalization for Generative Adversarial Networks ☆53 · May 14, 2021 · Updated 4 years ago
- Official PyTorch Implementation of aLRP Loss [NeurIPS2020] ☆138 · Dec 17, 2020 · Updated 5 years ago
- Deep Learning Research ☆16 · Nov 13, 2019 · Updated 6 years ago
- Pytorch implementation of Performer from the paper "Rethinking Attention with Performers". ☆25 · Oct 5, 2020 · Updated 5 years ago
- [ICCVW'21 Best Paper Award] All you need are a few pixels: semantic segmentation with PixelPick ☆67 · Jun 18, 2022 · Updated 3 years ago
- Code accompanying the NeurIPS 2020 submission "Teaching a GAN What Not to Learn." ☆32 · Sep 3, 2021 · Updated 4 years ago
- [CVPR 2020] Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation ☆92 · Nov 22, 2022 · Updated 3 years ago
- Gradient Origin Networks - a new type of generative model that is able to quickly learn a latent representation without an encoder ☆160 · Feb 4, 2021 · Updated 5 years ago
- Implementation of DropLoss for Long-Tail Instance Segmentation in Pytorch ☆42 · Apr 14, 2021 · Updated 4 years ago
- To be a next-generation DL-based phenotype prediction from genome mutations. ☆19 · May 17, 2021 · Updated 4 years ago
- Code for the ICML 2021 and ICLR 2022 papers: Skew Orthogonal Convolutions, Improved deterministic l2 robustness on CIFAR-10 and CIFAR-100 ☆18 · Feb 20, 2022 · Updated 3 years ago
- AutoML Two-Sample Test ☆19 · Aug 3, 2022 · Updated 3 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning ☆20 · Jul 7, 2022 · Updated 3 years ago
- DeLighT: Very Deep and Light-Weight Transformers ☆469 · Oct 16, 2020 · Updated 5 years ago
- [ICLR 2020] Lite Transformer with Long-Short Range Attention ☆611 · Jul 11, 2024 · Updated last year
- RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder ☆210 · Mar 18, 2021 · Updated 4 years ago
- Learning Features with Parameter-Free Layers, ICLR 2022 ☆84 · May 3, 2023 · Updated 2 years ago