Separius / awesome-fast-attention

list of efficient attention modules

☆1,013

Alternatives and similar repositories for awesome-fast-attention

Users who are interested in awesome-fast-attention are comparing it to the repositories listed below.

lartpang / PyTorchTricks
Some tricks of pytorch...
☆1,194 Updated last year
epfml / attention-cnn
Source code for "On the Relationship between Self-Attention and Convolutional Layers"
☆1,106 Updated 2 years ago
tczhangzhi / pytorch-distributed
A quickstart and benchmark for pytorch distributed training.
☆1,668 Updated last year
zasdfgbnm / TorchSnooper
Debug PyTorch code using PySnooper
☆797 Updated 4 years ago
ildoonet / pytorch-gradual-warmup-lr
Gradually-Warmup Learning Rate Scheduler for PyTorch
☆991 Updated 9 months ago
clcarwin / focal_loss_pytorch
A PyTorch Implementation of Focal Loss.
☆985 Updated 5 years ago
lessw2020 / Ranger-Deep-Learning-Optimizer
Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase
☆1,202 Updated last year
lhyfst / knowledge-distillation-papers
knowledge distillation papers
☆758 Updated 2 years ago
Oldpan / Pytorch-Memory-Utils
pytorch memory track code
☆1,017 Updated 4 years ago
dropreg / R-Drop
☆881 Updated last year
ranandalon / mtl
Unofficial implementation of: Multi-task learning using uncertainty to weigh losses for scene geometry and semantics
☆555 Updated 3 years ago
HobbitLong / PyContrast
PyTorch implementation of Contrastive Learning methods
☆1,988 Updated last year
CoinCheung / pytorch-loss
label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful
☆2,245 Updated 9 months ago
demuxin / pytorch_tricks
Some tricks for PyTorch
☆576 Updated 5 years ago
asheeshcric / awesome-contrastive-self-supervised-learning
A comprehensive list of awesome contrastive self-supervised learning papers.
☆1,284 Updated 10 months ago
Lyken17 / Efficient-PyTorch
My best practice of training large dataset using PyTorch.
☆1,100 Updated last year
FLHonker / Awesome-Knowledge-Distillation
Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).
☆2,614 Updated 2 years ago
sacmehta / delight
DeLighT: Very Deep and Light-Weight Transformers
☆470 Updated 4 years ago
lucidrains / lambda-networks
Implementation of LambdaNetworks, a new approach to image recognition that reaches SOTA with less compute
☆1,533 Updated 4 years ago
lucidrains / performer-pytorch
An implementation of Performer, a linear attention-based transformer, in Pytorch
☆1,142 Updated 3 years ago
tatp22 / linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
☆417 Updated 3 years ago
mit-han-lab / lite-transformer
[ICLR 2020] Lite Transformer with Long-Short Range Attention
☆612 Updated last year
lucidrains / mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
☆1,034 Updated 3 weeks ago
lucidrains / reformer-pytorch
Reformer, the efficient Transformer, in Pytorch
☆2,177 Updated 2 years ago
subeeshvasu / Awesome-Learning-with-Label-Noise
A curated list of resources for Learning with Noisy Labels
☆2,694 Updated 3 months ago
nmhkahn / torchsummaryX
torchsummaryX: Improved visualization tool of torchsummary
☆303 Updated 3 years ago
Mikoto10032 / AutomaticWeightedLoss
Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, Auxiliary Tasks in Multi-task Learning
☆633 Updated 5 years ago
majumderb / rezero
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆410 Updated last year
idiap / fast-transformers
Pytorch library for fast transformer implementations
☆1,725 Updated 2 years ago
Lyken17 / pytorch-memonger
Sublinear memory optimization for deep learning. https://arxiv.org/abs/1604.06174
☆599 Updated 5 years ago