Optimization algorithm which fits a ResNet to CIFAR-10 5x faster than SGD / Adam (with terrible generalization)
☆14 · Oct 20, 2023 · Updated 2 years ago
Alternatives and similar repositories for top-sgd
Users that are interested in top-sgd are comparing it to the libraries listed below.
- ☆34 · Jan 25, 2024 · Updated 2 years ago
- MATLAB MEX implementation of SVRG-SBB algorithms ☆12 · Nov 28, 2017 · Updated 8 years ago
- Adaptive gradient descent without descent ☆53 · Oct 12, 2021 · Updated 4 years ago
- Source code for paper "Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration" of NeurIPS 2019 ☆10 · Jan 25, 2024 · Updated 2 years ago
- Course assignment code for optimization methods and convex optimization ☆18 · Jan 31, 2020 · Updated 6 years ago
- Official implementation of Bayes Conditional Distribution Estimation for Knowledge Distillation Based on Conditional Mutual Information ☆11 · Sep 28, 2023 · Updated 2 years ago
- ☆11 · Dec 8, 2022 · Updated 3 years ago
- ☆16 · May 3, 2024 · Updated 2 years ago
- Source code of "Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers" EMNLP 2025 ☆17 · Jan 12, 2026 · Updated 3 months ago
- This is the public repo for the course HMMA238 'Software Development' ☆11 · Apr 20, 2021 · Updated 5 years ago
- Automatic Integration for Neural Spatio-Temporal Point Process models (AI-STPP) is a new paradigm for exact, efficient, non-parametric inf… ☆25 · Oct 14, 2024 · Updated last year
- ☆10 · Jul 6, 2021 · Updated 4 years ago
- HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arx… ☆29 · Feb 17, 2025 · Updated last year
- ☆17 · Dec 7, 2025 · Updated 5 months ago
- Code for the API, workload execution, and agents underlying the LLMail-Inject Adaptive Prompt Injection Challenge ☆23 · Apr 9, 2026 · Updated last month
- ☆10 · Apr 8, 2021 · Updated 5 years ago
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes" ☆19 · Jun 10, 2023 · Updated 2 years ago
- Implementations of the algorithms described in the paper: On the Convergence Theory for Hessian-Free Bilevel Algorithms. ☆11 · Nov 1, 2024 · Updated last year
- This is the official implementation of the ICML 2023 paper - Can Forward Gradient Match Backpropagation ? ☆13 · May 31, 2023 · Updated 2 years ago
- ☆22 · Jan 23, 2024 · Updated 2 years ago
- Experiments with Super-Universal Newton method. ☆13 · Aug 12, 2022 · Updated 3 years ago
- ClockBench - Visual Reasoning AI Benchmark ☆31 · Sep 4, 2025 · Updated 8 months ago
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili… ☆12 · Mar 7, 2025 · Updated last year
- Software dev. for data science (Python) ☆16 · May 1, 2025 · Updated last year
- El0ps: An Exact L0-Problem Solver ☆13 · Jan 6, 2026 · Updated 4 months ago
- Code for CVPR 2023 Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization ☆13 · Mar 27, 2023 · Updated 3 years ago
- A Julia package for adaptive proximal gradient and primal-dual algorithms