NydiaAI / g-mlp-tensorflow

A gMLP (gated MLP) implementation in TensorFlow 1.x, as described in the paper "Pay Attention to MLPs" (arXiv:2105.08050).

☆16

Alternatives and similar repositories for g-mlp-tensorflow:

Users who are interested in g-mlp-tensorflow are comparing it to the repositories listed below.

selimfirat / addml
Unsupervised Anomaly Detection via Deep Metric Learning with End-to-End Optimization
☆13 Updated last year
thudzj / NEigenmaps
☆11 Updated 10 months ago
calclavia / Performer-Pytorch
Pytorch implementation of Performer from the paper "Rethinking Attention with Performers".
☆24 Updated 4 years ago
rishikksh20 / CoaT-pytorch
CoaT: Co-Scale Conv-Attentional Image Transformers
☆16 Updated 3 years ago
leaderj1001 / Bag-of-MLP
Bag of MLP
☆20 Updated 3 years ago
gbup-group / EAN-efficient-attention-network
Implementation of the paper "Efficient Attention Network: Accelerate Attention by Searching Where to Plug".
☆20 Updated last year
siat-nlp / IPRLS
Code and data for the SIGIR'2021 paper "Iterative Network Pruning with Uncertainty Regularization for Lifelong Sentiment Classification"
☆10 Updated 3 years ago
cheneydon / efficient-bert
This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron …
☆32 Updated last year
jaketae / g-mlp
PyTorch implementation of Pay Attention to MLPs
☆40 Updated 3 years ago
Extreme-classification / DECAF
DECAF: Deep Extreme Classification with Label Features
☆53 Updated 2 years ago
lucasjinreal / wnnx_models
Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`.
☆12 Updated 2 years ago
Lifelong-ML / LASEM
Code for the ICML 2021 paper "Sharing Less is More: Lifelong Learning in Deep Networks with Selective Layer Transfer"
☆11 Updated 3 years ago
MarziEd / SubSpace-Capsule-Network
☆13 Updated 4 years ago
doerlbh / UnsupervisedAttentionMechanism
Code for our paper: "Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers".
☆21 Updated 3 years ago
zhangqi-here / UnifiedEAE
A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck
☆10 Updated 2 years ago
davidsvy / cosformer-pytorch
Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention".
☆44 Updated 3 years ago
ZidiXiu / ECRT
Supercharging Imbalanced Data Learning With Causal Representation Transfer
☆12 Updated 3 years ago
schwartz-lab-NLP / papa
Code for the PAPA paper
☆27 Updated 2 years ago
bhanML / SIGUA
ICML'20: SIGUA: Forgetting May Make Learning with Noisy Labels More Robust
☆13 Updated 4 years ago
facebookresearch / dmae_st
Directed masked autoencoders
☆14 Updated last year
lucidrains / tableformer-pytorch
Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch
☆37 Updated 2 years ago
Extreme-classification / ECLARE
ECLARE: Extreme Classification with Label Graph Correlations
☆41 Updated 2 years ago
titu1994 / tf-sha-rnn
Tensorflow port implementation of Single Headed Attention RNN
☆16 Updated 4 years ago
CyndxAI / QKNorm
Code for the paper "Query-Key Normalization for Transformers"
☆36 Updated 3 years ago
jwoongkim11 / QA-RAG
Code for the paper "From RAG to QA-RAG: Integrating Generative AI for Pharmaceutical Regulatory Compliance Process"
☆13 Updated 4 months ago
jaketae / fnet
PyTorch implementation of FNet: Mixing Tokens with Fourier transforms
☆25 Updated 3 years ago
Devin-Taylor / MultiAug
Multi-modal data augmentation for machine learning
☆16 Updated 5 years ago
google-research / noisy-fewshot-learning
☆23 Updated 4 years ago