Implementation of https://arxiv.org/abs/1904.00962
☆377 · Dec 9, 2020 · Updated 5 years ago
Alternatives and similar repositories for pytorch-lamb
Users who are interested in pytorch-lamb are comparing it to the libraries listed below.
- A LARS implementation in PyTorch ☆353 · Feb 21, 2020 · Updated 6 years ago
- "Layer-wise Adaptive Rate Scaling" in PyTorch ☆87 · Jan 22, 2021 · Updated 5 years ago
- Training Transformer-XL on 128 GPUs ☆141 · Jun 11, 2020 · Updated 5 years ago
- Implementations of ideas from recent papers ☆391 · Dec 22, 2020 · Updated 5 years ago
- torch-optimizer -- collection of optimizers for Pytorch ☆3,170 · Mar 22, 2024 · Updated 2 years ago
- LAMB Optimizer for Large Batch Training (TensorFlow version) ☆121 · Jan 17, 2020 · Updated 6 years ago
- On the Variance of the Adaptive Learning Rate and Beyond ☆2,551 · Jul 31, 2021 · Updated 4 years ago
- Transformer training code for sequential tasks ☆609 · Sep 14, 2021 · Updated 4 years ago
- Implementation of the LAMB optimizer for Keras from the paper "Reducing BERT Pre-Training Time from 3 Days to 76 Minutes" ☆75 · Apr 5, 2019 · Updated 7 years ago
- Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase ☆1,206 · Dec 22, 2023 · Updated 2 years ago
- AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights (ICLR 2021) ☆416 · Jan 13, 2021 · Updated 5 years ago
- Image data augmentation scheduler for albumentations transforms ☆19 · May 1, 2021 · Updated 4 years ago
- Fast, general, and tested differentiable structured prediction in PyTorch ☆1,128 · Apr 20, 2022 · Updated 3 years ago
- Semantic segmentation pipeline using Catalyst. ☆20 · Apr 3, 2020 · Updated 6 years ago
- PyTorch extensions for high performance and large scale training. ☆3,405 · Apr 26, 2025 · Updated 11 months ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021) ☆102 · Nov 2, 2020 · Updated 5 years ago
- pytorch implement of Lookahead Optimizer ☆195 · Jun 20, 2022 · Updated 3 years ago
- ☆220 · Jun 8, 2020 · Updated 5 years ago
- Library for faster pinned CPU <-> GPU transfer in Pytorch ☆683 · Feb 21, 2020 · Updated 6 years ago
- Fast BPE ☆678 · Jun 18, 2024 · Updated last year
- A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch ☆8,949 · Updated this week
- Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755) ☆2,113 · Jan 4, 2022 · Updated 4 years ago
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ☆2,930 · Feb 14, 2023 · Updated 3 years ago
- PyTorch implementation of LAMB for ImageNet/ResNet-50 training ☆13 · May 13, 2021 · Updated 4 years ago
- Understanding the Difficulty of Training Transformers ☆332 · May 31, 2022 · Updated 3 years ago
- An optimizer that trains as fast as Adam and as good as SGD. ☆2,907 · Jul 23, 2023 · Updated 2 years ago
- PyTorch layer-by-layer model profiler ☆606 · May 23, 2021 · Updated 4 years ago
- XLNet: Generalized Autoregressive Pretraining for Language Understanding ☆6,177 · May 28, 2023 · Updated 2 years ago
- Over9000 optimizer ☆424 · Nov 22, 2022 · Updated 3 years ago
- This repo parallelizes mAP_evaluation using python's multiprocessing module. ☆18 · Apr 14, 2022 · Updated 4 years ago
- Pytorch Implementation of ALBERT(A Lite BERT for Self-supervised Learning of Language Representations) ☆228 · Apr 7, 2021 · Updated 5 years ago
- A GPipe implementation in PyTorch ☆862 · Jul 25, 2024 · Updated last year
- A GPU performance profiling tool for PyTorch models ☆511 · Jul 13, 2021 · Updated 4 years ago
- Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve exis… ☆253 · Nov 8, 2021 · Updated 4 years ago
- Slides from various talks I gave ☆18 · Oct 25, 2018 · Updated 7 years ago
- higher is a pytorch library allowing users to obtain higher order gradients over losses spanning training loops rather than individual tr… ☆1,628 · Mar 25, 2022 · Updated 4 years ago
- PyProf2: PyTorch Profiling tool ☆82 · Jun 25, 2020 · Updated 5 years ago
- Configure Python functions explicitly and safely ☆130 · Nov 18, 2024 · Updated last year
- Profiling and inspecting memory in pytorch ☆1,078 · Sep 5, 2025 · Updated 7 months ago