LukasHedegaard / continual-transformers
Official Pytorch Implementation for "Continual Transformers: Redundancy-Free Attention for Online Inference" [ICLR 2023]
☆28 Updated last year
Alternatives and similar repositories for continual-transformers:
Users who are interested in continual-transformers are comparing it to the libraries listed below:
- A Python library for Continual Inference Networks in PyTorch ☆49 Updated 2 weeks ago
- Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention". ☆44 Updated 3 years ago
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch ☆115 Updated 4 years ago
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆19 Updated last week
- A variant of Transformer-XL where the memory is updated not with a queue, but with attention ☆48 Updated 4 years ago
- Pytorch implementation of Performer from the paper "Rethinking Attention with Performers". ☆25 Updated 4 years ago
- Code for the paper PermuteFormer ☆42 Updated 3 years ago
- Code for the paper "InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents". ☆19 Updated 3 years ago
- STABILIZING GRADIENTS FOR DEEP NEURAL NETWORKS VIA EFFICIENT SVD PARAMETERIZATION ☆16 Updated 6 years ago
- A PyTorch implementation of SimSiam based on CVPR 2021 paper "Exploring Simple Siamese Representation Learning" ☆10 Updated 4 years ago
- VIsually-Pivoted Audio and(N) Text ☆22 Updated 2 years ago
- PyTorch implementation of Pay Attention to MLPs ☆40 Updated 3 years ago
- Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch ☆52 Updated 4 years ago
- ☆23 Updated 4 years ago
- Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch ☆57 Updated 4 years ago
- This is the public github for our paper "Transformer with a Mixture of Gaussian Keys" ☆26 Updated 2 years ago
- Implementations of various linear RNN layers using pytorch and triton ☆50 Updated last year
- Representation learning for NLP @ JSALT19 ☆38 Updated 4 years ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆62 Updated 2 years ago
- ☆75 Updated 4 years ago
- Tensorflow Implementation of "Theory and Experiments on Vector Quantized Autoencoders" ☆14 Updated 6 years ago
- Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi… ☆50 Updated 2 years ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆60 Updated 2 years ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆64 Updated 11 months ago
- Discriminative Prototypes learned by Dynamic Time Warping (DTW) for Time Series Classification (TSC) ☆32 Updated 4 years ago
- PyTorch Implementations of Various Vector Quantization Methods ☆28 Updated 3 years ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach ☆20 Updated 3 years ago
- ☆27 Updated 8 months ago
- Code for the paper: Audio-Visual Model Distillation Using Acoustic Images ☆20 Updated 2 years ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆99 Updated 2 years ago