optimizer & lr scheduler & loss function collections in PyTorch
☆413 · May 20, 2026 · Updated last week
Alternatives and similar repositories for pytorch_optimizer
Users that are interested in pytorch_optimizer are comparing it to the libraries listed below.
- torch-optimizer -- collection of optimizers for Pytorch ☆3,167 · Mar 22, 2024 · Updated 2 years ago
- Prodigy and Schedule-Free, together at last. ☆94 · Sep 27, 2025 · Updated 8 months ago
- stochastic bfloat16 based optimizer library ☆21 · Dec 4, 2024 · Updated last year
- A collection of niche / personally useful PyTorch optimizers with modified code. ☆28 · Apr 14, 2026 · Updated last month
- The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training” ☆1,003 · Jan 30, 2024 · Updated 2 years ago
- 🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch ☆2,186 · Nov 27, 2024 · Updated last year
- D-Adaptation for SGD, Adam and AdaGrad ☆532 · Jan 22, 2025 · Updated last year
- Ranger deep learning optimizer rewrite to use newest components ☆342 · Mar 17, 2026 · Updated 2 months ago
- ☆22 · Jan 23, 2024 · Updated 2 years ago
- ☆268 · Dec 2, 2024 · Updated last year
- Testing various improvements to Ranger21 for 2022 ☆19 · Nov 6, 2024 · Updated last year
- Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models ☆817 · Jun 8, 2025 · Updated 11 months ago
- An optimizer that trains as fast as Adam and as good as SGD in Tensorflow ☆46 · May 4, 2019 · Updated 7 years ago
- Pytorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training ☆40 · May 4, 2026 · Updated 3 weeks ago
- Efficient kernel for RMS normalization with fused operations, includes both forward and backward passes, compatibility with PyTorch. ☆13 · Jun 5, 2024 · Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆103 · Dec 22, 2024 · Updated last year
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models ☆13 · Mar 9, 2024 · Updated 2 years ago
- Collection of the latest, greatest, deep learning optimizers (for Pytorch) - CNN, NLP suitable ☆218 · Apr 4, 2021 · Updated 5 years ago
- Fast, Modern, and Low Precision PyTorch Optimizers ☆129 · May 16, 2026 · Updated last week
- Collect optimizer related papers, data, repositories ☆101 · Apr 5, 2026 · Updated last month
- 7th place solution to RecSys Challenge 2023 by Corca ☆11 · Jan 8, 2024 · Updated 2 years ago
- Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793 ☆458 · May 13, 2025 · Updated last year
- ☆23 · Jan 5, 2025 · Updated last year
- ☆13 · Nov 7, 2024 · Updated last year
- PyTorch Implementation of Variance Reduced Optimization Algorithms -- SARAH and SVRG. ☆15 · Jul 11, 2021 · Updated 4 years ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆188 · Sep 12, 2024 · Updated last year
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆137 · May 5, 2026 · Updated 3 weeks ago
- APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Outstanding Paper Honorable Mention ☆359 · Nov 29, 2025 · Updated 5 months ago
- SAM: Sharpness-Aware Minimization (PyTorch) ☆1,975 · Feb 21, 2024 · Updated 2 years ago
- Schedule-Free Optimization in PyTorch ☆2,284 · May 18, 2026 · Updated last week
- Efficient optimizers ☆326 · May 13, 2026 · Updated 2 weeks ago
- TorchOpt is an efficient library for differentiable optimization built upon PyTorch. ☆631 · May 4, 2026 · Updated 3 weeks ago
- ☆35 · Dec 5, 2022 · Updated 3 years ago
- A simple way to keep track of an Exponential Moving Average (EMA) version of your Pytorch model ☆650 · Dec 19, 2025 · Updated 5 months ago
- Parameter-Free Optimizers for Pytorch ☆132 · Apr 23, 2024 · Updated 2 years ago
- Multidimensional indexing for tensors ☆140 · Jul 17, 2023 · Updated 2 years ago
- ☆11 · Nov 8, 2023 · Updated 2 years ago
- Amos optimizer with JEstimator lib. ☆83 · May 15, 2024 · Updated 2 years ago
- Permutation invariant training in PyTorch ☆13 · Oct 2, 2020 · Updated 5 years ago