cybertronai/pytorch-lamb

cybertronai / pytorch-lamb

Implementation of https://arxiv.org/abs/1904.00962

☆379

Alternatives and similar repositories for pytorch-lamb

Users that are interested in pytorch-lamb are comparing it to the libraries listed below.

kakaobrain / torchlars
A LARS implementation in PyTorch
☆353 | Feb 21, 2020 | Updated 6 years ago
noahgolmant / pytorch-lars
"Layer-wise Adaptive Rate Scaling" in PyTorch
☆86 | Jan 22, 2021 | Updated 5 years ago
pytorch / contrib
Implementations of ideas from recent papers
☆389 | Dec 22, 2020 | Updated 5 years ago
cybertronai / transformer-xl
Training Transformer-XL on 128 GPUs
☆140 | Jun 11, 2020 | Updated 6 years ago
ymcui / LAMB_Optimizer_TF
LAMB Optimizer for Large Batch Training (TensorFlow version)
☆122 | Jan 17, 2020 | Updated 6 years ago
jettify / pytorch-optimizer
torch-optimizer -- collection of optimizers for Pytorch
☆3,168 | Mar 22, 2024 | Updated 2 years ago
LiyuanLucasLiu / RAdam
On the Variance of the Adaptive Learning Rate and Beyond
☆2,548 | Jul 31, 2021 | Updated 4 years ago
facebookresearch / adaptive-span
Transformer training code for sequential tasks
☆610 | Sep 14, 2021 | Updated 4 years ago
titu1994 / keras-LAMB-Optimizer
Implementation of the LAMB optimizer for Keras from the paper "Reducing BERT Pre-Training Time from 3 Days to 76 Minutes"
☆75 | Apr 5, 2019 | Updated 7 years ago
lessw2020 / Ranger-Deep-Learning-Optimizer
Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase
☆1,207 | Dec 22, 2023 | Updated 2 years ago
clovaai / AdamP
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights (ICLR 2021)
☆412 | Jan 13, 2021 | Updated 5 years ago
KiriLev / albu_scheduler
Image data augmentation scheduler for albumentations transforms
☆19 | May 1, 2021 | Updated 5 years ago
ternaus / iglovikov_segmentation
Semantic segmentation pipeline using Catalyst.
☆20 | Apr 3, 2020 | Updated 6 years ago
harvardnlp / pytorch-struct
Fast, general, and tested differentiable structured prediction in PyTorch
☆1,133 | Apr 20, 2022 | Updated 4 years ago
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,411 | Apr 26, 2025 | Updated last year
laiguokun / Funnel-Transformer
☆220 | Jun 8, 2020 | Updated 6 years ago
lonePatient / lookahead_pytorch
pytorch implement of Lookahead Optimizer
☆194 | Jun 20, 2022 | Updated 4 years ago
Santosh-Gupta / SpeedTorch
Library for faster pinned CPU <-> GPU transfer in Pytorch
☆682 | Feb 21, 2020 | Updated 6 years ago
NVIDIA / apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
☆8,986 | Updated this week
glample / fastBPE
Fast BPE
☆677 | Jun 18, 2024 | Updated 2 years ago
asappresearch / sru
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
☆2,107 | Jan 4, 2022 | Updated 4 years ago
facebookresearch / XLM
PyTorch original implementation of Cross-lingual Language Model Pretraining.
☆2,923 | Feb 14, 2023 | Updated 3 years ago
LiyuanLucasLiu / Transformer-Clinic
Understanding the Difficulty of Training Transformers
☆332 | May 31, 2022 | Updated 4 years ago
NUS-HPC-AI-Lab / pytorch-lamb
PyTorch implementation of LAMB for ImageNet/ResNet-50 training
☆13 | May 13, 2021 | Updated 5 years ago
clovaai / length-adaptive-transformer
Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)
☆102 | Nov 2, 2020 | Updated 5 years ago
Luolc / AdaBound
An optimizer that trains as fast as Adam and as good as SGD.
☆2,903 | Jul 23, 2023 | Updated 3 years ago
awwong1 / torchprof
PyTorch layer-by-layer model profiler
☆605 | May 23, 2021 | Updated 5 years ago
zihangdai / xlnet
XLNet: Generalized Autoregressive Pretraining for Language Understanding
☆6,185 | May 28, 2023 | Updated 3 years ago
pyaf / parallel_mAP_evaluation
This repo parallelizes mAP_evaluation using python's multiprocessing module.
☆18 | Apr 14, 2022 | Updated 4 years ago
mgrankin / over9000
Over9000 optimizer
☆424 | Nov 22, 2022 | Updated 3 years ago
graykode / ALBERT-Pytorch
Pytorch Implementation of ALBERT (A Lite BERT for Self-supervised Learning of Language Representations)
☆228 | Apr 7, 2021 | Updated 5 years ago
guolinke / TUPE
Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve exis…
☆252 | Nov 8, 2021 | Updated 4 years ago
NVIDIA / PyProf
A GPU performance profiling tool for PyTorch models
☆510 | Jul 13, 2021 | Updated 5 years ago
adityaiitb / pyprof2
PyProf2: PyTorch Profiling tool
☆82 | Jun 25, 2020 | Updated 6 years ago
kakaobrain / torchgpipe
A GPipe implementation in PyTorch
☆865 | Jul 25, 2024 | Updated 2 years ago
cybertronai / imagenet18
Train ImageNet in 18 minutes on AWS
☆134 | Mar 20, 2024 | Updated 2 years ago
facebookresearch / higher
higher is a pytorch library allowing users to obtain higher order gradients over losses spanning training loops rather than individual tr…
☆1,629 | Mar 25, 2022 | Updated 4 years ago
PetrochukM / HParams
Configure Python functions explicitly and safely
☆131 | Nov 18, 2024 | Updated last year
Stonesjtu / pytorch_memlab
Profiling and inspecting memory in pytorch
☆1,078 | Jun 8, 2026 | Updated last month