DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
☆63 Aug 23, 2023 Updated 2 years ago
Alternatives and similar repositories for dog
Users that are interested in dog are comparing it to the libraries listed below.
- Code for the paper "Function-Space Learning Rates" ☆25 Jun 3, 2025 Updated 9 months ago
- Source code for "Taming GANs with Lookahead–Minmax", ICLR 2021. ☆15 Mar 28, 2021 Updated 4 years ago
- ☆74 Dec 7, 2024 Updated last year
- ☆12 Sep 26, 2019 Updated 6 years ago
- Combining SOAP and MUON ☆19 Feb 11, 2025 Updated last year
- ☆22 Oct 12, 2023 Updated 2 years ago
- ☆12 Oct 5, 2020 Updated 5 years ago
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective ☆63 Mar 11, 2025 Updated last year
- ☆27 Mar 21, 2024 Updated 2 years ago
- Last-layer Laplace approximation code examples ☆82 Oct 18, 2021 Updated 4 years ago
- [ICML2022] Training Your Sparse Neural Network Better with Any Mask. Ajay Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, and Zhangyang Wang ☆30 Jul 24, 2022 Updated 3 years ago
- PyTorch implementation of efficient algorithms for DRO with CVaR and Chi-Square uncertainty sets ☆64 Oct 21, 2022 Updated 3 years ago
- Code base for SRSGD. ☆28 Mar 5, 2020 Updated 6 years ago
- ☆23 Jun 18, 2024 Updated last year
- PyTorch linear operators for curvature matrices (Hessian, Fisher/GGN, KFAC, ...) ☆63 Updated this week
- Benchmarking optimization methods on convex problems. ☆33 Aug 8, 2025 Updated 7 months ago
- Self-Distillation with weighted ground-truth targets; ResNet and Kernel Ridge Regression ☆19 Oct 12, 2021 Updated 4 years ago
- Code for paper Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks ☆13 Aug 9, 2022 Updated 3 years ago
- Some Demo Code for the MPA Exercise. ☆10 Dec 4, 2017 Updated 8 years ago
- Experimental version of jxbz/agd implementing support for bias terms, affine parameters, transformers, etc. ☆12 Jul 30, 2023 Updated 2 years ago
- Model-agnostic posthoc calibration without distributional assumptions ☆41 Oct 20, 2023 Updated 2 years ago
- Kyurae Kim's Awesome Reads ☆20 Nov 20, 2024 Updated last year
- Wrap around any model to output differentially private prediction sets with finite sample validity on any dataset. ☆18 Mar 3, 2024 Updated 2 years ago
- codebase release for EMNLP2023 paper publication ☆19 Sep 18, 2025 Updated 6 months ago
- ☆37 Oct 3, 2023 Updated 2 years ago
- ☆13 Jan 30, 2021 Updated 5 years ago
- GeoZarr extension for OpenLayers ☆12 Jun 27, 2024 Updated last year
- MDL Complexity computations and experiments from the paper "Revisiting complexity and the bias-variance tradeoff". ☆18 Jun 12, 2023 Updated 2 years ago
- Code for the paper "Deep FTRL-ORW: An Efficient Deep Reinforcement Learning Algorithm for Solving Imperfect Information Extensive-Form Ga… ☆11 Dec 1, 2022 Updated 3 years ago
- ☆16 Jan 16, 2025 Updated last year
- ☆13 Jan 15, 2025 Updated last year
- Official implementation for the paper "Controlled Sparsity via Constrained Optimization" ☆11 Aug 10, 2022 Updated 3 years ago
- ☆15 Jun 5, 2023 Updated 2 years ago
- ☆14 Jul 6, 2021 Updated 4 years ago
- Repo for "Smart Word Suggestions" (SWS) task and benchmark ☆20 Dec 4, 2023 Updated 2 years ago
- Starter template for an online book or docs site made with Markdown and mdBook 🦀 📙 ☆13 Nov 14, 2022 Updated 3 years ago
- PyTorch classification with Cifar-10, Cifar-100, and STL-10 ☆14 Jul 24, 2019 Updated 6 years ago
- Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793 ☆453 May 13, 2025 Updated 10 months ago
- Minimal pretraining script for language modeling in PyTorch. Supporting torch compilation and DDP. It includes a model implementation and… ☆43 Mar 16, 2026 Updated last week