google / bi-tempered-loss

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences. https://arxiv.org/pdf/1906.03361.pdf

☆147

Alternatives and similar repositories for bi-tempered-loss

Users who are interested in bi-tempered-loss are comparing it to the libraries listed below.


sdoria / SimpleSelfAttention
A simpler version of the self-attention layer from SAGAN, and some image classification results.
☆214 Updated 6 years ago
OverLordGoldDragon / keras-adamw
Keras/TF implementation of AdamW, SGDW, NadamW, Warm Restarts, and Learning Rate multipliers
☆169 Updated 3 years ago
lessw2020 / mish
Mish Deep Learning Activation Function for PyTorch / FastAI
☆161 Updated 5 years ago
lessw2020 / Ranger-Mish-ImageWoof-5
Repo to build on / reproduce the record breaking Ranger-Mish-SelfAttention setup on FastAI ImageWoof dataset 5 epochs
☆116 Updated 6 years ago
oval-group / smooth-topk
Smooth Loss Functions for Deep Top-k Classification
☆259 Updated 4 years ago
michaelrzhang / lookahead
Implementation for the Lookahead Optimizer.
☆242 Updated 3 years ago
sgugger / Adam-experiments
Experiments with Adam/AdamW/amsgrad
☆201 Updated 7 years ago
pytorch / contrib
Implementations of ideas from recent papers
☆392 Updated 4 years ago
lonePatient / lookahead_pytorch
PyTorch implementation of the Lookahead Optimizer
☆195 Updated 3 years ago
podgorskiy / DareBlopy
Data Reading Blocks for Python
☆104 Updated 4 years ago
alphadl / lookahead.pytorch
lookahead optimizer (Lookahead Optimizer: k steps forward, 1 step back) for pytorch
☆337 Updated 6 years ago
galatolofederico / pytorch-balanced-batch
A pytorch dataset sampler for always sampling balanced batches.
☆118 Updated 4 years ago
loshchil / AdamW-and-SGDW
Decoupled Weight Decay Regularization (ICLR 2019)
☆285 Updated 6 years ago
digantamisra98 / EvoNorm
Unofficial PyTorch Implementation of EvoNorm
☆123 Updated 4 years ago
taki0112 / RAdam-Tensorflow
Simple Tensorflow implementation of "On The Variance Of The Adaptive Learning Rate And Beyond"
☆97 Updated 5 years ago
mpyrozhok / adamwr
Implements https://arxiv.org/abs/1711.05101 AdamW optimizer, cosine learning rate scheduler and "Cyclical Learning Rates for Training Neu…
☆153 Updated 6 years ago
kakaobrain / autoclint
A specially designed light version of Fast AutoAugment
☆171 Updated 5 years ago
szymonmaszke / torchfunc
PyTorch functions and utilities to make your life easier
☆194 Updated 4 years ago
egg-west / AdamW-pytorch
Implementation and experiments for AdamW on Pytorch
☆94 Updated 6 years ago
taki0112 / AdaBound-Tensorflow
Simple Tensorflow implementation of "Adaptive Gradient Methods with Dynamic Bound of Learning Rate" (ICLR 2019)
☆150 Updated 6 years ago
titu1994 / keras-adabound
Keras implementation of AdaBound
☆130 Updated 6 years ago
noahgolmant / pytorch-lars
"Layer-wise Adaptive Rate Scaling" in PyTorch
☆87 Updated 4 years ago
tbachlechner / ReZero-examples
PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆62 Updated last year
moskomule / homura
homura is a library for fast prototyping DL research
☆106 Updated 3 years ago
Vermeille / Torchelie
Torchélie is a set of utility functions, layers, losses, models, trainers and other things for PyTorch.
☆110 Updated 3 months ago
eladhoffer / utils.pytorch
Utilities for Pytorch
☆88 Updated 3 years ago
Cohere-Labs-Community / Targeted-Dropout
Complementary code for the Targeted Dropout paper
☆255 Updated 6 years ago
negrinho / deep_architect
A general, modular, and programmable architecture search framework
☆123 Updated 2 years ago
mgrankin / over9000
Over9000 optimizer
☆425 Updated 3 years ago
AlexiaJM / MaximumMarginGANs
Code for paper: "Support Vector Machines, Wasserstein's distance and gradient-penalty GANs maximize a margin"
☆179 Updated 5 years ago