maum-ai / pnlp-mixer
Unofficial PyTorch Implementation for pNLP-Mixer: an Efficient all-MLP Architecture for Language (https://arxiv.org/abs/2202.04350)
☆63 Updated 3 years ago
Alternatives and similar repositories for pnlp-mixer:
Users who are interested in pnlp-mixer are comparing it to the repositories listed below:
- Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization" ☆103 Updated 2 years ago
- A PyTorch Implementation of the Luna: Linear Unified Nested Attention ☆41 Updated 3 years ago
- Official code for Wav2Seq ☆96 Updated 2 years ago
- Implementation of Fast Transformer in Pytorch ☆173 Updated 3 years ago
- Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch ☆73 Updated 2 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 Updated 2 years ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆99 Updated 2 years ago
- TF/Keras code for DiffStride, a pooling layer with learnable strides. ☆125 Updated 3 years ago
- ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802 ☆93 Updated last year
- Unofficial PyTorch implementation of Google's FNet: Mixing Tokens with Fourier Transforms. With checkpoints. ☆73 Updated 2 years ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆62 Updated 2 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ☆225 Updated 2 years ago
- PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI ☆176 Updated last year
- ICASSP 2023 Accepted ☆189 Updated 10 months ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆60 Updated 2 years ago
- Sequence modeling with Mega. ☆295 Updated 2 years ago
- Implementation of the GBST block from the Charformer paper, in Pytorch ☆116 Updated 3 years ago
- Unofficial PyTorch implementation of "Step-unrolled Denoising Autoencoders for Text Generation" ☆24 Updated 2 years ago
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆112 Updated 2 years ago
- Implementation of RealFormer using pytorch ☆101 Updated 4 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆48 Updated 3 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆204 Updated last year
- Learning Features with Parameter-Free Layers, ICLR 2022 ☆85 Updated last year
- Axial Positional Embedding for Pytorch ☆76 Updated last month
- Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch ☆226 Updated 6 months ago
- Implementation of NWT, audio-to-video generation, in Pytorch ☆88 Updated 3 years ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆57 Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ☆60 Updated 3 years ago
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models ☆30 Updated 2 years ago
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆37 Updated 3 years ago