lukemelas / do-you-even-need-attention
Is the attention layer even necessary? (https://arxiv.org/abs/2105.02723)
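The paper's question is concrete: in a Vision Transformer block, can the self-attention sublayer be replaced by a plain feed-forward layer applied across the patch dimension? Below is a minimal sketch of that substitution, not code from the repo; the module names, expansion factor, and shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FeedForwardTokenMixer(nn.Module):
    """Mixes information across patches with a plain MLP, standing in
    for self-attention (illustrative; not the repo's implementation)."""
    def __init__(self, num_patches: int, expansion: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_patches, num_patches * expansion),
            nn.GELU(),
            nn.Linear(num_patches * expansion, num_patches),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, dim); mix over the patch axis
        return self.net(x.transpose(1, 2)).transpose(1, 2)

class Block(nn.Module):
    """A transformer-style block with the attention sublayer swapped out."""
    def __init__(self, dim: int, num_patches: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mix = FeedForwardTokenMixer(num_patches)
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mix = nn.Sequential(
            nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.token_mix(self.norm1(x))
        return x + self.channel_mix(self.norm2(x))

x = torch.randn(2, 196, 384)     # e.g. 14x14 patches, 384-dim embeddings
print(Block(384, 196)(x).shape)  # torch.Size([2, 196, 384])
```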
⭐484 · Updated 3 years ago
Alternatives and similar repositories for do-you-even-need-attention:
Users interested in do-you-even-need-attention are comparing it to the repositories listed below.
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ⭐225 · Updated 3 years ago
- NFNets and Adaptive Gradient Clipping for SGD implemented in PyTorch. Find explanation at tourdeml.github.io/blog/ ⭐345 · Updated last year
- Implementation of ConvMixer for "Patches Are All You Need? 🤷" ⭐1,070 · Updated 2 years ago
- A LARS implementation in PyTorch ⭐344 · Updated 5 years ago
- Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention ⭐262 · Updated 3 years ago
- Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch ⭐425 · Updated 3 years ago
- Code for the Convolutional Vision Transformer (ConViT) ⭐466 · Updated 3 years ago
- Fully featured implementation of Routing Transformer ⭐292 · Updated 3 years ago
- ⭐376 · Updated last year
- EsViT: Efficient self-supervised Vision Transformers ⭐410 · Updated last year
- Pre-trained NFNets with 99% of the accuracy of the official paper "High-Performance Large-Scale Image Recognition Without Normalization". ⭐159 · Updated 4 years ago
- Useful PyTorch functions and modules that are not implemented in PyTorch by default ⭐187 · Updated 11 months ago
- Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc. ⭐235 · Updated 3 years ago
- Seamless analysis of your PyTorch models (RAM usage, FLOPs, MACs, receptive field, etc.) ⭐218 · Updated last month
- Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch ⭐304 · Updated 3 years ago
- Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision ⭐218 · Updated 3 years ago
- Collection of the latest, greatest, deep learning optimizers (for Pytorch) - CNN, NLP suitable ⭐214 · Updated 4 years ago
- DeLighT: Very Deep and Light-Weight Transformers ⭐467 · Updated 4 years ago
- Implementation of ResMLP, an all MLP solution to image classification, in Pytorch ⭐197 · Updated 2 years ago
- Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms (a minimal sketch of the mixing step follows this list) ⭐259 · Updated 3 years ago
- Code to reproduce the results in the FAIR research papers "Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting V…" ⭐487 · Updated last year
- Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141) ⭐457 · Updated 2 years ago
- ⭐245 · Updated 3 years ago
- Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones ⭐197 · Updated 4 years ago
- [ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845 ⭐119 · Updated 3 years ago
- Ranger deep learning optimizer rewrite to use newest components ⭐329 · Updated last year
- Implementation of the LAMB optimizer (https://arxiv.org/abs/1904.00962) ⭐374 · Updated 4 years ago
- Code for Noisy Student Training. https://arxiv.org/abs/1911.04252 ⭐760 · Updated 4 years ago
- Nested Hierarchical Transformer https://arxiv.org/pdf/2105.12723.pdf ⭐196 · Updated 8 months ago
- Compute CNN receptive field size in pytorch in one line ⭐359 · Updated 11 months ago
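For the FNet entry above, the token-mixing step the paper describes is simple enough to sketch: the self-attention sublayer is replaced by a 2D discrete Fourier transform over the sequence and hidden dimensions, keeping only the real part. This is a hedged sketch of that operation, not the linked repo's code:

```python
import torch
import torch.nn as nn

class FourierMixer(nn.Module):
    """Token mixing via a 2D DFT over the sequence and hidden
    dimensions, keeping the real part (as in the FNet paper)."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        return torch.fft.fft2(x, dim=(-2, -1)).real

x = torch.randn(2, 128, 256)
print(FourierMixer()(x).shape)  # torch.Size([2, 128, 256])
```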