teddykoker / grokking
PyTorch implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆36 Updated 3 years ago
Alternatives and similar repositories for grokking
Users who are interested in grokking are comparing it to the libraries listed below.
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆62 Updated 4 years ago
- paper lists and information on mean-field theory of deep learning ☆78 Updated 6 years ago
- Code for the paper: "Tensor Programs II: Neural Tangent Kernel for Any Architecture" ☆105 Updated 5 years ago
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆79 Updated 3 years ago
- Hessian spectral density estimation in TF and Jax ☆123 Updated 4 years ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆59 Updated 3 years ago
- DeepOBS: A Deep Learning Optimizer Benchmark Suite ☆106 Updated last year
- ☆100 Updated 3 years ago
- Code for NeurIPS 2019 paper: "Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes… ☆246 Updated 5 years ago
- ☆67 Updated 6 years ago
- ☆191 Updated 2 months ago
- Structured matrices for compressing neural networks ☆67 Updated last year
- Convolutional Neural Tangent Kernel ☆113 Updated 5 years ago
- ☆170 Updated last year
- Parameter-Free Optimizers for Pytorch ☆130 Updated last year
- CHOP: An optimization library based on PyTorch, with applications to adversarial examples and structured neural network training. ☆78 Updated last year
- ☆37 Updated 3 years ago
- Repo to accompany paper "Implicit Self-Regularization in Deep Neural Networks..." ☆45 Updated 6 years ago
- Omnigrok: Grokking Beyond Algorithmic Data ☆61 Updated 2 years ago
- codebase for "A Theory of the Inductive Bias and Generalization of Kernel Regression and Wide Neural Networks" ☆49 Updated 2 years ago
- 🧀 Pytorch code for the Fromage optimiser. ☆127 Updated last year
- A minimal implementation of a VAE with BinConcrete (relaxed Bernoulli) latent distribution in TensorFlow. ☆23 Updated 5 years ago
- ☆26 Updated 2 years ago
- A centralized place for deep thinking code and experiments ☆86 Updated 2 years ago
- [NeurIPS'19] Deep Equilibrium Models Jax Implementation ☆40 Updated 4 years ago
- Study on the applicability of Direct Feedback Alignment to neural view synthesis, recommender systems, geometric learning, and natural la… ☆90 Updated 3 years ago
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆179 Updated last week
- Explores the ideas presented in Deep Ensembles: A Loss Landscape Perspective (https://arxiv.org/abs/1912.02757) by Stanislav Fort, Huiyi … ☆65 Updated 5 years ago
- ☆54 Updated last year
- Official codebase for "Distribution-Free, Risk-Controlling Prediction Sets" ☆85 Updated last year