Tony-Y / pytorch_warmupLinks

Learning Rate Warmup in PyTorch

☆410

Alternatives and similar repositories for pytorch_warmup

Users that are interested in pytorch_warmup are comparing it to the libraries listed below

Sorting:

katsura-jp / pytorch-cosine-annealing-with-warmup
☆464Updated 2 years ago
fadel / pytorch_ema
Tiny PyTorch library for maintaining a moving average of a collection of parameters.
☆434Updated 10 months ago
ildoonet / pytorch-gradual-warmup-lr
Gradually-Warmup Learning Rate Scheduler for PyTorch
☆991Updated 9 months ago
lucidrains / mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
☆1,034Updated 3 weeks ago
locuslab / convmixer
Implementation of ConvMixer for "Patches Are All You Need? 🤷"
☆1,076Updated 2 years ago
AdeelH / pytorch-multi-class-focal-loss
An (unofficial) implementation of Focal Loss, as described in the RetinaNet paper, generalized to the multi-class case.
☆236Updated last year
vballoli / nfnets-pytorch
NFNets and Adaptive Gradient Clipping for SGD implemented in PyTorch. Find explanation at tourdeml.github.io/blog/
☆347Updated last year
lucidrains / transformer-in-transformer
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorc…
☆305Updated 3 years ago
wzlxjtu / PositionalEncoding2D
A PyTorch implementation of the 1d and 2d Sinusoidal positional encoding/embedding.
☆253Updated 4 years ago
PistonY / torch-toolbox
🛠 Toolbox to extend PyTorch functionalities
☆421Updated last year
Fangyh09 / pytorch-receptive-field
Compute CNN receptive field size in pytorch in one line
☆361Updated last year
xxxnell / how-do-vits-work
(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
☆819Updated 3 years ago
yangkky / distributed_tutorial
☆261Updated 5 years ago
lucidrains / axial-attention
Implementation of Axial attention - attending to multi-dimensional data efficiently
☆384Updated 3 years ago
SHI-Labs / Compact-Transformers
Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!)
☆536Updated 8 months ago
SwinTransformer / Transformer-SSL
This is an official implementation for "Self-Supervised Learning with Swin Transformers".
☆658Updated 4 years ago
lucidrains / linformer
Implementation of Linformer for Pytorch
☆294Updated last year
vahidk / tfrecord
Standalone TFRecord reader/writer with PyTorch data loaders
☆888Updated 2 months ago
rishikksh20 / MLP-Mixer-pytorch
Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision
☆218Updated 4 years ago
cmsflash / efficient-attention
An implementation of the efficient attention module.
☆320Updated 4 years ago
lukemelas / do-you-even-need-attention
Is the attention layer even necessary? (https://arxiv.org/abs/2105.02723)
☆486Updated 4 years ago
nmhkahn / torchsummaryX
torchsummaryX: Improved visualization tool of torchsummary
☆303Updated 3 years ago
kakaobrain / torchlars
A LARS implementation in PyTorch
☆349Updated 5 years ago
facebookresearch / convit
Code for the Convolutional Vision Transformer (ConViT)
☆466Updated 3 years ago
clcarwin / focal_loss_pytorch
A PyTorch Implementation of Focal Loss.
☆986Updated 5 years ago
lucidrains / linear-attention-transformer
Transformer based on a variant of attention that is linear complexity in respect to sequence length
☆789Updated last year
rishikksh20 / FNet-pytorch
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
☆259Updated 4 years ago
ildoonet / pytorch-randaugment
Unofficial PyTorch Reimplementation of RandAugment.
☆636Updated 2 years ago
tatp22 / multidim-positional-encoding
An implementation of 1D, 2D, and 3D positional encoding in Pytorch and TensorFlow
☆600Updated 9 months ago
chinhsuanwu / coatnet-pytorch
A PyTorch implementation of "CoAtNet: Marrying Convolution and Attention for All Data Sizes"
☆387Updated 3 years ago