[ICLR 2025 Spotlight] Code release for "Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training"
☆18Feb 20, 2025Updated last year
Alternatives and similar repositories for SAM-in-Late-Phase
Users who are interested in SAM-in-Late-Phase are comparing it to the libraries listed below.
- [ICML 2024] Code release for "On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm"☆11Feb 20, 2025Updated last year
- Welcome to the 'In Context Learning Theory' Reading Group☆30Nov 8, 2024Updated last year
- [NeurIPS 2023] Code release for "Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity"☆19Oct 19, 2023Updated 2 years ago
- [ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond"☆17Feb 27, 2025Updated last year
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation☆16Sep 2, 2024Updated last year
- Code for reproducing the results in "How Well do Sparse Imagenet Models Transfer?", presented at CVPR 2022☆10Jun 3, 2022Updated 3 years ago
- Code for the paper "Randomly pivoted Cholesky: Practical approximation of a kernel matrix with few entry evaluations"☆33Dec 4, 2025Updated 4 months ago
- ☆11May 22, 2024Updated last year
- Code related to "Beyond spectral gap: The role of the topology in decentralized learning".☆14Jun 7, 2022Updated 3 years ago
- ☆11Apr 22, 2024Updated last year
- Zhejiang University Beamer template☆15May 19, 2022Updated 3 years ago
- Official code for "Vision Transformers with Self-Distilled Registers" (NeurIPS 2025 Spotlight)☆33Dec 6, 2025Updated 4 months ago
- ☆10Aug 13, 2019Updated 6 years ago
- [ICML 2019] The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects☆15Apr 12, 2020Updated 6 years ago
- Combining SOAP and MUON☆20Feb 11, 2025Updated last year
- ☆15Apr 29, 2024Updated last year
- one of my CSC courses in CUHK(SZ)☆19Dec 29, 2021Updated 4 years ago
- A work plan template useful for students of the bachelor's degree in Computer Science @Unipadova who need to start the…☆10May 20, 2018Updated 7 years ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆29Aug 19, 2025Updated 8 months ago
- Notes from the Computational Mathematics course held by professor Antonio Frangioni and professor Federico Poloni at University of Pisa☆12Aug 28, 2021Updated 4 years ago
- A curated list of papers of interesting empirical study and insight on deep learning. Continually updating...☆397Jan 7, 2026Updated 3 months ago
- Header-only, generic and dependency-free C++17 implementation of Heaps and Priority Queues☆19Feb 12, 2022Updated 4 years ago
- ☆23Nov 27, 2025Updated 4 months ago
- [ICML 2023] Decentralized SGD and Average-direction SAM are Asymptotically Equivalent☆20Dec 4, 2023Updated 2 years ago
- one of my CSC courses in CUHK(SZ)☆20Apr 30, 2023Updated 2 years ago
- Official Implementation of CL-ALFRED (ICLR'24)☆31Oct 24, 2024Updated last year
- Codebase for ICML submission "DOGE: Domain Reweighting with Generalization Estimation"☆21Feb 29, 2024Updated 2 years ago
- LINe: Out-of-Distribution Detection by Leveraging Important Neurons (CVPR 2023)☆13Jun 13, 2023Updated 2 years ago
- ☆14Oct 21, 2021Updated 4 years ago
- Hyper-networks for Unified Visual Representation (HUVR) use implicit neural representation to bridge the gap between understanding and ge…☆29Jan 23, 2026Updated 2 months ago
- While language static analyzer☆10Oct 13, 2020Updated 5 years ago
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"☆40Jul 18, 2025Updated 9 months ago
- P4Control: Line-Rate Cross-Host Attack Prevention via In-Network Information Flow Control Enabled by Programmable Switches and eBPF☆11May 20, 2024Updated last year
- My wezterm config☆16Feb 18, 2023Updated 3 years ago
- CIFAR10 ResNets implemented in JAX+Flax☆12Apr 6, 2022Updated 4 years ago
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024)☆39Nov 1, 2024Updated last year
- [NeurIPS'23] Binary Classification with Confidence Difference☆10May 13, 2024Updated last year
- The dataset repo of "CLCIFAR: CIFAR-Derived Benchmark Datasets with Human Annotated Complementary Labels" paper☆16Aug 8, 2025Updated 8 months ago
- ☆16May 18, 2023Updated 2 years ago