fabio-deep / ReZero-ResNet

Unofficial PyTorch implementation of ReZero in ResNet

☆23

Alternatives and similar repositories for ReZero-ResNet

Users who are interested in ReZero-ResNet are comparing it to the repositories listed below.

vinbhaskara / adams
Exploiting Uncertainty of Loss Landscape for Stochastic Optimization
☆15 Updated 6 years ago
noahgolmant / pytorch-lr-dropout
"Learning Rate Dropout" in PyTorch
☆34 Updated 5 years ago
uclaml / Padam
Partially Adaptive Momentum Estimation method in the paper "Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep …
☆39 Updated 2 years ago
xternalz / SDPoint
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks
☆18 Updated 5 years ago
thomasbrandon / swish-torch
Swish Activation - PyTorch CUDA Implementation
☆37 Updated 5 years ago
BayesWatch / pytorch-blockswap
Code for BlockSwap (ICLR 2020).
☆33 Updated 4 years ago
lucidrains / kronecker-attention-pytorch
Implementation of Kronecker Attention in Pytorch
☆19 Updated 4 years ago
ducha-aiki / LSUV-pytorch
Simple implementation of the LSUV initialization in PyTorch
☆58 Updated last year
eladhoffer / norm_matters
☆23 Updated 6 years ago
moskomule / shampoo.pytorch
An implementation of shampoo
☆77 Updated 7 years ago
ppwwyyxx / FRN-on-common-ImageNet-baseline
Filter Response Normalization tested on better ImageNet baselines.
☆35 Updated 5 years ago
znxlwm / pytorch-apex-experiment
Simple experiment of Apex (A PyTorch Extension)
☆47 Updated 5 years ago
shivram1987 / diffGrad
diffGrad: An Optimization Method for Convolutional Neural Networks
☆55 Updated 2 years ago
lnsmith54 / BOSS
This repository provides the code for replicating the experiments in the paper "Building One-Shot Semi-supervised (BOSS) Learning up to F…
☆36 Updated 4 years ago
Yonghongwei / Advanced-optimizer-with-Gradient-Centralization
Advanced optimizer with Gradient-Centralization
☆21 Updated 4 years ago
sebastiani / pytorch-attention-augmented-convolution
A pytorch implementation of https://arxiv.org/abs/1904.09925
☆16 Updated 6 years ago
MerHS / SASA-pytorch
Unofficial implementation of Stand-Alone Self-Attention in Vision Models (obsolete)
☆44 Updated 6 years ago
sayakpaul / Training-BatchNorm-and-Only-BatchNorm
Experiments with the ideas presented in https://arxiv.org/abs/2003.00152 by Frankle et al.
☆29 Updated 4 years ago
NVlabs / unas
Official implementation of "UNAS: Differentiable Architecture Search Meets Reinforcement Learning", CVPR 2020 Oral
☆61 Updated last year
WarBean / emp
Easy Multiprocessing for Python
☆43 Updated 4 years ago
NVlabs / iccv2019-mixed-precision-tutorial
☆28 Updated 5 years ago
vacancy / AdvancedIndexing-PyTorch
(Batched) advanced indexing for PyTorch.
☆53 Updated 7 months ago
lucidrains / hamburger-pytorch
Pytorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition"
☆99 Updated 4 years ago
sayakpaul / Adaptive-Gradient-Clipping
Minimal implementation of adaptive gradient clipping (https://arxiv.org/abs/2102.06171) in TensorFlow 2.
☆85 Updated 4 years ago
tbachlechner / ReZero-examples
PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆61 Updated last year
mkolod / fast_upsampling
☆33 Updated last year
mingxingtan / mnasnet
MnasNet snapshot
☆35 Updated 6 years ago
shellysheynin / Locally-SAG-Transformer
Official Pytorch implementation of the paper: "Locally Shifted Attention With Early Global Integration"
☆15 Updated 3 years ago
minhtannguyen / SRSGD
Code base for SRSGD.
☆28 Updated 5 years ago
buttomnutstoast / Multigrid-Neural-Architectures
Multigrid Neural Architecture
☆30 Updated 7 years ago