[ICML 2025] Official Pytorch code for "SASSHA: Sharpness-aware Adaptive Second-order Optimization With Stable Hessian Approximation"
☆23 · Aug 11, 2025 · Updated 9 months ago
Alternatives and similar repositories for Sassha
Users who are interested in Sassha are comparing it to the libraries listed below.
- ☆18 · Nov 10, 2025 · Updated 6 months ago
- [UAI 2025] Official implementation of "Critical Influence of Overparameterization on Sharpness-aware Minimization" ☆20 · May 14, 2025 · Updated last year
- [ICML 2023] Official implementation of "A Closer Look at the Intervention Procedure of Concept Bottleneck Models" ☆28 · Feb 19, 2024 · Updated 2 years ago
- A Signal Propagation Perspective for Pruning Neural Networks at Initialization ☆14 · Jun 23, 2020 · Updated 5 years ago
- Official implementation of "Multi-armed Bandit Algorithm against Strategic Replication" ☆14 · May 17, 2022 · Updated 4 years ago
- ☆76 · Dec 7, 2024 · Updated last year
- This repository gathers learning resources about performance estimation problems ☆15 · Mar 18, 2026 · Updated 2 months ago
- ☆24 · Oct 24, 2021 · Updated 4 years ago
- Pytorch code for experiments on Linear Transformers ☆24 · Jan 12, 2024 · Updated 2 years ago
- [ICLR 25] A novel framework for building intrinsically interpretable LLMs with human-understandable concepts to ensure safety, reliabilit… ☆33 · Feb 5, 2026 · Updated 4 months ago
- [AAAI 2023] Official implementation of 'Anonymization for Skeleton Action Recognition' ☆29 · Dec 29, 2022 · Updated 3 years ago
- Implementation of Effective Sparsification of Neural Networks with Global Sparsity Constraint ☆31 · Mar 24, 2022 · Updated 4 years ago
- DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures ☆32 · Aug 13, 2020 · Updated 5 years ago
- Minimal pretraining script for language modeling in PyTorch. Supporting torch compilation and DDP. It includes a model implementation and… ☆48 · Apr 27, 2026 · Updated last month
- ProxQuant: Quantized Neural Networks via Proximal Operators ☆30 · Feb 19, 2019 · Updated 7 years ago
- Official Github for Wharton STAT 4830 ☆57 · Apr 16, 2026 · Updated last month
- Repository for our NeurIPS 2022 paper "Concept Embedding Models", our NeurIPS 2023 paper "Learning to Receive Help", and our ICML 2025 pa… ☆78 · Apr 5, 2026 · Updated 2 months ago
- ☆194 · Updated this week
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule ☆63 · Aug 23, 2023 · Updated 2 years ago
- Lightweight torch implementation of rigl, a sparse-to-sparse optimizer. ☆60 · Nov 17, 2021 · Updated 4 years ago
- Code for the paper "Post-hoc Concept Bottleneck Models". Spotlight @ ICLR 2023 ☆94 · May 20, 2024 · Updated 2 years ago
- Code for "Picking Winning Tickets Before Training by Preserving Gradient Flow" https://openreview.net/pdf?id=SkgsACVKPH ☆105 · Feb 18, 2020 · Updated 6 years ago
- ☆632 · May 19, 2026 · Updated 3 weeks ago
- [NeurIPS 2020] "The Lottery Ticket Hypothesis for Pre-trained BERT Networks", Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Ya… ☆142 · Dec 30, 2021 · Updated 4 years ago
- ☆235 · Feb 12, 2025 · Updated last year
- Approximating neural network loss landscapes in low-dimensional parameter subspaces for PyTorch ☆355 · Nov 30, 2023 · Updated 2 years ago
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models ☆188 · Jan 1, 2025 · Updated last year
- ☆197 · Mar 5, 2026 · Updated 3 months ago
- ☆293 · Dec 16, 2024 · Updated last year
- Learning Sparse Neural Networks through L0 regularization ☆249 · Jul 17, 2020 · Updated 5 years ago
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆417 · May 11, 2026 · Updated 3 weeks ago
- SAM: Sharpness-Aware Minimization (PyTorch) ☆1,980 · Feb 21, 2024 · Updated 2 years ago
- Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793 ☆458 · May 13, 2025 · Updated last year
- ☆303 · Aug 20, 2024 · Updated last year
- Implementations of ideas from recent papers ☆389 · Dec 22, 2020 · Updated 5 years ago
- This is a curated list for Information Bottleneck Principle, in memory of Professor Naftali Tishby. ☆397 · Feb 12, 2026 · Updated 3 months ago
- Awesome LLM compression research papers and tools. ☆1,841 · Feb 23, 2026 · Updated 3 months ago
- Tiny PyTorch library for maintaining a moving average of a collection of parameters. ☆445 · Oct 2, 2024 · Updated last year
- Papers for deep neural network compression and acceleration ☆401 · Jun 21, 2021 · Updated 4 years ago