MarlonBecker / MSAM
☆18 · Updated 11 months ago
Alternatives and similar repositories for MSAM:
Users interested in MSAM are comparing it to the libraries listed below.
- [ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation ☆12 · Updated last year
- ☆34 · Updated 11 months ago
- ☆33 · Updated 2 years ago
- ☆11 · Updated 2 years ago
- A modern look at the relationship between sharpness and generalization [ICML 2023] ☆43 · Updated last year
- Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023] ☆25 · Updated last year
- Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation (ICML'24 Oral) ☆13 · Updated 5 months ago
- SLTrain: a sparse plus low-rank approach for parameter- and memory-efficient pretraining (NeurIPS 2024) ☆27 · Updated 2 months ago
- Towards Understanding Sharpness-Aware Minimization [ICML 2022] ☆35 · Updated 2 years ago
- ☆16 · Updated 2 years ago
- ☆57 · Updated last year
- ☆62 · Updated last month
- Revisiting Efficient Training Algorithms for Transformer-based Language Models (NeurIPS 2023) ☆80 · Updated last year
- Code for the paper "Efficient Dataset Distillation using Random Feature Approximation" ☆37 · Updated last year
- Code for the paper "Why Transformers Need Adam: A Hessian Perspective" ☆47 · Updated 8 months ago
- Source code of "What can linearized neural networks actually say about generalization?" ☆19 · Updated 3 years ago
- Code for testing DCT plus Sparse (DCTpS) networks ☆14 · Updated 3 years ago
- Bayesian Low-Rank Adaptation for Large Language Models ☆29 · Updated 6 months ago
- Source code for the paper "Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models" ☆20 · Updated 6 months ago
- Official code for "In Search of Robust Measures of Generalization" (NeurIPS 2020) ☆28 · Updated 4 years ago
- ☆15 · Updated last year
- ☆13 · Updated 10 months ago
- Deep Learning & Information Bottleneck ☆53 · Updated last year
- ☆27 · Updated last year
- Predicting Out-of-Distribution Error with the Projection Norm ☆17 · Updated 2 years ago
- ☆16 · Updated 7 months ago
- [TPAMI 2023] Low Dimensional Landscape Hypothesis is True: DNNs can be Trained in Tiny Subspaces ☆40 · Updated 2 years ago
- Implementation of the paper "Improving the Accuracy-Robustness Trade-off of Classifiers via Adaptive Smoothing" ☆11 · Updated 11 months ago
- Distilling Model Failures as Directions in Latent Space ☆46 · Updated last year
- Official implementation of Generalized Data Weighting via Class-level Gradient Manipulation (NeurIPS 2021) (http://… ☆24 · Updated 2 years ago