google / bi-tempered-loss
Robust Bi-Tempered Logistic Loss Based on Bregman Divergences. https://arxiv.org/pdf/1906.03361.pdf
☆148 Updated 3 years ago
Alternatives and similar repositories for bi-tempered-loss:
Users who are interested in bi-tempered-loss are comparing it to the libraries listed below:
- Implementations of ideas from recent papers ☆393 Updated 4 years ago
- Smooth Loss Functions for Deep Top-k Classification ☆252 Updated 3 years ago
- Semi-supervised ImageNet1K models ☆243 Updated 5 years ago
- Over9000 optimizer ☆426 Updated 2 years ago
- Unofficial PyTorch Implementation of EvoNorm ☆121 Updated 3 years ago
- A simpler version of the self-attention layer from SAGAN, and some image classification results. ☆212 Updated 5 years ago
- Source code for the paper "Divide and Conquer the Embedding Space for Metric Learning", CVPR 2019 ☆264 Updated 5 years ago
- PyTorch implementation of the Lookahead Optimizer ☆189 Updated 2 years ago
- A TensorFlow re-implementation of Momentum Contrast (MoCo): https://arxiv.org/abs/1911.05722 ☆161 Updated last year
- Mish Deep Learning Activation Function for PyTorch / FastAI ☆161 Updated 5 years ago
- Lookahead optimizer (Lookahead Optimizer: k steps forward, 1 step back) for PyTorch ☆335 Updated 5 years ago
- TensorFlow code for Differentiable architecture search ☆72 Updated 6 years ago
- Standardizing weights to accelerate micro-batch training ☆547 Updated 3 years ago
- Utilities for PyTorch ☆89 Updated 2 years ago
- Implementation and experiments for AdamW on PyTorch ☆93 Updated 5 years ago
- Using the CLR algorithm for training (https://arxiv.org/abs/1506.01186) ☆108 Updated 6 years ago
- homura is a library for fast prototyping DL research ☆107 Updated 2 years ago
- A New Optimization Technique for Deep Neural Networks ☆535 Updated 3 years ago
- Decoupled Weight Decay Regularization (ICLR 2019) ☆273 Updated 6 years ago
- Experiments with Adam/AdamW/amsgrad ☆200 Updated 6 years ago
- Accelerate training by storing parameters in one contiguous chunk of memory. ☆291 Updated 4 years ago
- Official PyTorch Implementation of "TResNet: High-Performance GPU-Dedicated Architecture" (WACV 2021) ☆473 Updated 4 months ago
- Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem [CVPR 2019, oral] ☆182 Updated 5 years ago
- Repo to build on / reproduce the record-breaking Ranger-Mish-SelfAttention setup on the FastAI ImageWoof dataset (5 epochs) ☆116 Updated 5 years ago
- Complementary code for the Targeted Dropout paper ☆255 Updated 5 years ago
- Implementation for the Lookahead Optimizer. ☆240 Updated 2 years ago
- Scaling and Benchmarking Self-Supervised Visual Representation Learning ☆586 Updated 3 years ago
- A PyTorch dataset sampler for always sampling balanced batches. ☆114 Updated 4 years ago
- Useful PyTorch functions and modules that are not implemented in PyTorch by default ☆187 Updated 11 months ago
- Unofficial PyTorch Implementation of Unsupervised Data Augmentation. ☆146 Updated 4 years ago