epfml / REQ
☆17 · Updated 8 months ago
Alternatives and similar repositories for REQ:
Users who are interested in REQ are comparing it to the repositories listed below.
- Source code of "What can linearized neural networks actually say about generalization?" ☆20 · Updated 3 years ago
- Towards Understanding Sharpness-Aware Minimization [ICML 2022] ☆35 · Updated 2 years ago
- A modern look at the relationship between sharpness and generalization [ICML 2023] ☆43 · Updated last year
- ☆63 · Updated 2 months ago
- Simple CIFAR10 ResNet example with JAX. ☆23 · Updated 3 years ago
- Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023] ☆26 · Updated last year
- [ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation ☆12 · Updated last year
- SGD with large step sizes learns sparse features [ICML 2023] ☆32 · Updated last year
- Code for testing DCT plus Sparse (DCTpS) networks ☆14 · Updated 3 years ago
- ☆15 · Updated last year
- ☆10 · Updated 3 years ago
- PyTorch code for "Improving Self-Supervised Learning by Characterizing Idealized Representations" ☆40 · Updated 2 years ago
- Training vision models with full-batch gradient descent and regularization ☆37 · Updated 2 years ago
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective ☆48 · Updated 9 months ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 · Updated 2 years ago
- Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021). ☆58 · Updated 3 years ago
- Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models" ☆21 · Updated last year
- Code to implement the AND-mask and geometric mean to do gradient based optimization, from the paper "Learning explanations that are hard … ☆39 · Updated 4 years ago
- Simple data balancing baselines for worst-group-accuracy benchmarks. ☆41 · Updated last year
- ☆34 · Updated last year
- [ICML'21] Improved Contrastive Divergence Training of Energy Based Models ☆62 · Updated 2 years ago
- Code release for REPAIR: REnormalizing Permuted Activations for Interpolation Repair ☆46 · Updated last year
- [ICML 2024] SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705) ☆21 · Updated 3 months ago
- ☆54 · Updated 4 years ago
- Visualization of mean field and neural tangent kernel regime ☆21 · Updated 6 months ago
- ☆51 · Updated 4 months ago
- CIFAR-5m dataset ☆38 · Updated 4 years ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆60 · Updated 4 months ago
- nanoGPT-like codebase for LLM training ☆89 · Updated this week
- ☆34 · Updated 2 years ago