LukasHedegaard / continual-transformers
Official PyTorch Implementation for "Continual Transformers: Redundancy-Free Attention for Online Inference" [ICLR 2023]
☆28 · Updated last year
Alternatives and similar repositories for continual-transformers:
Users who are interested in continual-transformers are comparing it to the libraries listed below.
- A Python library for Continual Inference Networks in PyTorch ☆49 · Updated last month
- Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention". ☆44 · Updated 3 years ago
- Implementation of Memformer, a Memory-augmented Transformer, in PyTorch ☆115 · Updated 4 years ago
- Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi… ☆50 · Updated 2 years ago
- Implementation of Multistream Transformers in PyTorch ☆53 · Updated 3 years ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆60 · Updated 3 years ago
- PyTorch implementation of Performer from the paper "Rethinking Attention with Performers". ☆25 · Updated 4 years ago
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆19 · Updated 2 weeks ago
- Representation learning for NLP @ JSALT19 ☆38 · Updated 4 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 · Updated last year
- Code for the paper "InteL-VAEs: Adding Inductive Biases to Variational Auto-Encoders via Intermediary Latents". ☆19 · Updated 3 years ago
- VIsually-Pivoted Audio and(N) Text ☆22 · Updated 2 years ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in PyTorch ☆99 · Updated 2 years ago
- TensorFlow Implementation of "Theory and Experiments on Vector Quantized Autoencoders" ☆14 · Updated 6 years ago
- PyTorch Implementations of Various Vector Quantization Methods ☆28 · Updated 3 years ago
- A PyTorch implementation of SimSiam based on CVPR 2021 paper "Exploring Simple Siamese Representation Learning" ☆10 · Updated 4 years ago
- Implementations of various linear RNN layers using PyTorch and Triton ☆49 · Updated last year
- Code for the paper PermuteFormer ☆42 · Updated 3 years ago
- This repository contains source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 · Updated 2 years ago
- PyTorch implementation of FNet: Mixing Tokens with Fourier transforms ☆26 · Updated 3 years ago
- This is the public GitHub repository for our paper "Transformer with a Mixture of Gaussian Keys" ☆26 · Updated 2 years ago
- Official implementation of OSSGAN [CVPR 2022] ☆21 · Updated 2 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Updated 3 years ago
- STABILIZING GRADIENTS FOR DEEP NEURAL NETWORKS VIA EFFICIENT SVD PARAMETERIZATION ☆16 · Updated 6 years ago
- [ICLR 2023] "Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Better Representations", Ziyu Jian… ☆24 · Updated 2 years ago
- WeightNet: Revisiting the Design Space of Weight Networks ☆19 · Updated 4 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in PyTorch ☆118 · Updated 3 years ago