juntang-zhuang / Adabelief-Optimizer
Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"
☆1,050 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for Adabelief-Optimizer
- Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase ☆1,193 · Updated 10 months ago
- Implementation of LambdaNetworks, a new approach to image recognition that reaches SOTA with less compute ☆1,532 · Updated 3 years ago
- torch-optimizer -- collection of optimizers for Pytorch ☆3,040 · Updated 7 months ago
- On the Variance of the Adaptive Learning Rate and Beyond ☆2,536 · Updated 3 years ago
- Pytorch library for fast transformer implementations ☆1,642 · Updated last year
- An implementation of Performer, a linear attention-based transformer, in Pytorch ☆1,093 · Updated 2 years ago
- Source code for "On the Relationship between Self-Attention and Convolutional Layers" ☆1,085 · Updated last year
- Reformer, the efficient Transformer, in Pytorch ☆2,116 · Updated last year
- Code for Noisy Student Training. https://arxiv.org/abs/1911.04252 ☆752 · Updated 3 years ago
- The Official PyTorch Implementation of "NVAE: A Deep Hierarchical Variational Autoencoder" (NeurIPS 2020 spotlight paper) ☆1,023 · Updated last year
- Fast, differentiable sorting and ranking in PyTorch ☆775 · Updated 10 months ago
- A learning rate range test implementation in PyTorch ☆922 · Updated last month
- Standalone TFRecord reader/writer with PyTorch data loaders ☆863 · Updated 2 months ago
- AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty ☆980 · Updated 3 months ago
- Toolbox of models, callbacks, and datasets for AI/ML researchers. ☆1,691 · Updated this week
- Ranger deep learning optimizer rewrite to use newest components ☆322 · Updated 8 months ago
- Over9000 optimizer ☆425 · Updated last year
- A New Optimization Technique for Deep Neural Networks ☆532 · Updated 2 years ago
- Deep Learning Experiment Management ☆639 · Updated last year
- [NeurIPS'2021] "TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up", Yifan Jiang, Shiyu Chang, Zhangyang Wang ☆1,642 · Updated 2 years ago
- An All-MLP solution for Vision, from Google AI ☆1,001 · Updated last month
- NFNets and Adaptive Gradient Clipping for SGD implemented in PyTorch. Find explanation at tourdeml.github.io/blog/ ☆345 · Updated 9 months ago
- Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth" ☆406 · Updated 3 months ago
- [NeurIPS'19] Deep Equilibrium Models ☆727 · Updated 2 years ago
- Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch ☆1,092 · Updated last year
- lookahead optimizer (Lookahead Optimizer: k steps forward, 1 step back) for pytorch ☆334 · Updated 5 years ago
- Code snippets created for the PyTorch discussion board ☆543 · Updated 3 years ago
- higher is a pytorch library allowing users to obtain higher order gradients over losses spanning training loops rather than individual tr… ☆1,589 · Updated 2 years ago
- Transformer based on a variant of attention that is linear complexity in respect to sequence length ☆695 · Updated 6 months ago
- Profiling and inspecting memory in pytorch ☆1,018 · Updated 3 months ago