epfml / REQ
Related projects:
- Towards Understanding Sharpness-Aware Minimization [ICML 2022]
- A modern look at the relationship between sharpness and generalization [ICML 2023]
- Source code of "What can linearized neural networks actually say about generalization?"
- Training vision models with full-batch gradient descent and regularization
- SGD with large step sizes learns sparse features [ICML 2023]
- Simple CIFAR10 ResNet example with JAX
- Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models"
- Why Do We Need Weight Decay in Modern Deep Learning? [arXiv, Oct 2023]
- Code for the paper "Compositional Visual Generation and Inference with Energy Based Models" [NeurIPS 2020]
- SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent [ICML 2024] (http://arxiv.org/abs/2312.05705)
- Simple data balancing baselines for worst-group-accuracy benchmarks
- PyTorch code for "Improving Self-Supervised Learning by Characterizing Idealized Representations"
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving" [ICML 2021]
- Code for testing DCT plus Sparse (DCTpS) networks
- CIFAR-5m dataset
- Provably (and non-vacuously) bounding test error of deep neural networks under distribution shift with unlabeled test data
- Gradient Starvation: A Learning Proclivity in Neural Networks
- Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians [ICML 2019]
- Code implementing the AND-mask and geometric mean for gradient-based optimization, from the paper "Learning explanations that are hard …"
- Do input gradients highlight discriminative features? [NeurIPS 2021] (https://arxiv.org/abs/2102.12781)
- Latest Weight Averaging [NeurIPS HITY 2022]