formll / dogLinks

DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule

☆63

Alternatives and similar repositories for dog

Users that are interested in dog are comparing it to the libraries listed below

Sorting:

bremen79 / parameterfree
Parameter-Free Optimizers for Pytorch
☆130Updated last year
shikaiqiu / compute-better-spent
☆61Updated last year
KindXiaoming / Omnigrok
Omnigrok: Grokking Beyond Algorithmic Data
☆62Updated 2 years ago
lixilinx / psgd_torch
Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…
☆188Updated last month
edwardjhu / TP4
Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522)
☆63Updated 4 years ago
michaelsdr / sinkformers
Transformers with doubly stochastic attention
☆50Updated 3 years ago
nikhilvyas / SOAP
☆223Updated 11 months ago
AndPotap / einsum-search
☆33Updated last year
thjashin / multires-conv
Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)
☆127Updated 2 years ago
vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆89Updated last year
stanislavfort / dissect-git-re-basin
Replicating and dissecting the git-re-basin project in one-click-replication Colabs
☆35Updated 3 years ago
Ping-C / optimizer
This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…
☆40Updated 2 years ago
locuslab / edge-of-stability
☆72Updated 11 months ago
google-deepmind / eigengame
Open source code for EigenGame.
☆33Updated 2 years ago
LIONS-EPFL / scion
☆47Updated last month
microsoft / goodpoints
A Python package for generating concise, high-quality summaries of a probability distribution
☆54Updated 2 weeks ago
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆143Updated last year
f-dangel / singd
[ICML 2024] SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705)
☆23Updated last year
google-deepmind / conformal_training
This repository contains a Jax implementation of conformal training corresponding to the ICLR'22 paper "learning optimal conformal classi…
☆130Updated 3 years ago
tml-epfl / why-weight-decay
Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]
☆68Updated last year
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆110Updated 2 weeks ago
lucidrains / gateloop-transformer
Implementation of GateLoop Transformer in Pytorch and Jax
☆90Updated last year
facebookresearch / w2ot
Euclidean Wasserstein-2 optimal transportation
☆47Updated 2 years ago
andres-fr / skerch
Sketched linear operations for PyTorch
☆97Updated 3 weeks ago
Sea-Snell / grokking
unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆79Updated 3 years ago
ludwigwinkler / JaxLightning
Running Jax in PyTorch Lightning
☆114Updated 11 months ago
aks2203 / deep-thinking
A centralized place for deep thinking code and experiments
☆87Updated 2 years ago
joneswack / dp-rfs
This repository contains PyTorch implementations of various random feature maps for dot product kernels.
☆22Updated last year
ruke1ire / RTF
A State-Space Model with Rational Transfer Function Representation.
☆83Updated last year
optimizedlearning / mechanic
☆36Updated last year