zoq / Awesome-Optimizer
Collects optimizer-related papers, data, and repositories
☆94 · Updated 8 months ago
Alternatives and similar repositories for Awesome-Optimizer
Users interested in Awesome-Optimizer are comparing it to the libraries listed below.
- Distributed K-FAC preconditioner for PyTorch · ☆89 · Updated this week
- Implementation of "Gradients without backpropagation" paper (https://arxiv.org/abs/2202.08587) using functorch · ☆110 · Updated 2 years ago
- TensorLy-Torch: Deep Tensor Learning with TensorLy and PyTorch · ☆80 · Updated last year
- Neural Tangent Kernel Papers · ☆115 · Updated 6 months ago
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule · ☆63 · Updated last year
- Create animations for the optimization trajectory of neural nets · ☆157 · Updated last year
- ☆206 · Updated 8 months ago
- Omnigrok: Grokking Beyond Algorithmic Data · ☆61 · Updated 2 years ago
- optimizer & lr scheduler & loss function collections in PyTorch · ☆327 · Updated this week
- 😎 A curated list of tensor decomposition resources for model compression · ☆77 · Updated last week
- ☆70 · Updated 8 months ago
- ☆142 · Updated last month
- (NeurIPS 2024) QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation · ☆30 · Updated 8 months ago
- summer school materials · ☆44 · Updated 2 years ago
- PyTorch implementation of KFAC, a port of https://github.com/tensorflow/kfac/ · ☆26 · Updated last year
- A collection of AWESOME things about information geometry · ☆164 · Updated last year
- Parameter-Free Optimizers for PyTorch · ☆130 · Updated last year
- Code for the papers "Linear Algebra with Transformers" (TMLR) and "What is my Math Transformer Doing?" (AI for Maths Workshop, NeurIPS 2022) · ☆75 · Updated 11 months ago
- A thoroughly investigated survey of tensorial neural networks · ☆137 · Updated 6 months ago
- A general-purpose, deep learning-first library for constrained optimization in PyTorch · ☆136 · Updated last month
- ☆53 · Updated 10 months ago
- Optimization algorithm that fits a ResNet to CIFAR-10 5x faster than SGD / Adam (with terrible generalization) · ☆14 · Updated last year
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] · ☆66 · Updated 10 months ago
- ☆232 · Updated 5 months ago
- Deep Learning & Information Bottleneck · ☆61 · Updated 2 years ago
- Code associated with the paper "Sparse Bayesian Optimization" · ☆26 · Updated last year
- Mutual information estimators and benchmark · ☆51 · Updated 6 months ago
- PyTorch code for experiments on Linear Transformers · ☆21 · Updated last year
- Code for the paper "Why Transformers Need Adam: A Hessian Perspective" · ☆60 · Updated 4 months ago
- Unofficial implementation of the Linear Recurrent Unit (LRU, Orvieto et al. 2023) · ☆55 · Updated 3 weeks ago
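Many of the optimizer repositories above (e.g., the parameter-free optimizers and the optimizer/scheduler collection) are typically used as drop-in replacements for the built-in `torch.optim` optimizers. A minimal sketch of that swap, using `torch.optim.Adam` as a stand-in; the model, data, and hyperparameters below are placeholder assumptions, not taken from any of the listed projects:

```python
import torch
import torch.nn as nn

# Placeholder model and dummy data for illustration only.
model = nn.Linear(10, 1)
x = torch.randn(64, 10)
y = torch.randn(64, 1)

# Adam stands in here; a library optimizer that follows the
# torch.optim.Optimizer interface would be constructed the same way.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    optimizer.zero_grad()          # clear accumulated gradients
    loss = loss_fn(model(x), y)    # forward pass
    loss.backward()                # backpropagate
    optimizer.step()               # apply the optimizer's update rule
```

Because only the construction line changes, comparing optimizers from these collections usually amounts to swapping that single call and re-running the loop.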