formll / dog
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
☆64 Updated 2 years ago
Alternatives and similar repositories for dog
Users who are interested in dog are comparing it to the libraries listed below.
- ☆62 Updated last year
- Parameter-Free Optimizers for Pytorch ☆130 Updated last year
- Omnigrok: Grokking Beyond Algorithmic Data ☆62 Updated 2 years ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆37 Updated 3 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆92 Updated 2 years ago
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆191 Updated 3 weeks ago
- ☆33 Updated last year
- ☆238 Updated last year
- ☆73 Updated last year
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper. ☆42 Updated 2 years ago
- Transformers with doubly stochastic attention ☆53 Updated 3 years ago
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆63 Updated 4 years ago
- Parallelizing non-linear sequential models over the sequence length ☆56 Updated 7 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆83 Updated last year
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆40 Updated 2 years ago
- Deep Networks Grok All the Time and Here is Why ☆38 Updated last year
- ☆52 Updated last month
- A Python package for generating concise, high-quality summaries of a probability distribution ☆57 Updated last week
- Agustinus' very opinionated publication-ready plotting library ☆70 Updated 8 months ago
- [ICML 2024] SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705) ☆24 Updated last year
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆127 Updated 2 years ago
- IVON optimizer for neural networks based on variational learning. ☆80 Updated last year
- Running Jax in PyTorch Lightning ☆119 Updated last year
- LoRA for arbitrary JAX models and functions ☆144 Updated last year
- ☆35 Updated last year
- Euclidean Wasserstein-2 optimal transportation ☆47 Updated 2 years ago
- ☆27 Updated 3 years ago
- Distributed K-FAC preconditioner for PyTorch ☆94 Updated this week
- ☆31 Updated 10 months ago
- Official PyTorch implementation of NeuralSVD (ICML 2024) ☆22 Updated last year