Analyze the dynamic stability of SGD
☆13Nov 25, 2018Updated 7 years ago
Alternatives and similar repositories for sgd.stability
Users that are interested in sgd.stability are comparing it to the libraries listed below.
- ☆15Apr 19, 2020Updated 6 years ago
- Summer course on mathematical theory of deep learning☆55Jul 31, 2019Updated 6 years ago
- ☆10Apr 23, 2026Updated 2 months ago
- Open hardware baseboard for Nvidia Jetson Thor AGX T5000 System on Module☆33Apr 1, 2026Updated 3 months ago
- Datasets for Hyperparameter Optimization of Neural Machine Translation☆10Aug 19, 2024Updated last year
- Towards Understanding Sharpness-Aware Minimization [ICML 2022]☆38Jun 14, 2022Updated 4 years ago
- Code to reproduce some of the figures in the paper "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"☆147Apr 24, 2017Updated 9 years ago
- code to reproduce the empirical results in the research paper☆40Oct 12, 2021Updated 4 years ago
- a cargo subcommand to package your application into a docker image☆19Feb 19, 2020Updated 6 years ago
- Huazhang Mathematics Translation Series☆15Feb 4, 2025Updated last year
- Scrapes novels published on the web and converts them to Aozora Bunko format text☆19Feb 9, 2017Updated 9 years ago
- ☆17Aug 22, 2021Updated 4 years ago
- Official Implementation of "Transferring Inductive Biases Through Knowledge Distillation"☆15Jun 3, 2020Updated 6 years ago
- ☆10Aug 18, 2016Updated 9 years ago
- ☆13Oct 8, 2021Updated 4 years ago
- Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians (ICML 2019)☆16Apr 27, 2019Updated 7 years ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]☆73Sep 25, 2024Updated last year
- Implementation of Monte Carlo Word Movers Distance in Python with TensorFlow☆12Sep 12, 2016Updated 9 years ago
- Large-batch Training, Neural Network Optimization☆10Nov 8, 2019Updated 6 years ago
- Machine learning course using Python☆13Apr 26, 2022Updated 4 years ago
- Really fast readability☆19Apr 25, 2024Updated 2 years ago
- Belief Propagation Network for Hard Inductive Semi-Supervised Learning (IJCAI 2019)☆21Jul 6, 2023Updated 2 years ago
- DeepSig's dataset for Machine Learning of Software Defined Radio☆23Sep 15, 2018Updated 7 years ago
- Python implementation for Combining Latent Space and Structured Kernels for Bayesian Optimization over Combinatorial Spaces.☆13Nov 30, 2021Updated 4 years ago
- Originally written by Eric S. Raymond; a guide to asking questions the right way in order to get satisfactory answers☆18Jan 5, 2021Updated 5 years ago
- ☆10Apr 23, 2021Updated 5 years ago
- [ICML 2019] The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects☆15Apr 12, 2020Updated 6 years ago
- ☆17Feb 4, 2025Updated last year
- ☆21Jul 12, 2018Updated 7 years ago
- An open-source book that chronicles the evolution of artificial intelligence, from its origins to the present and beyond. A continuously updated open-source e-book, covering…☆37Sep 9, 2025Updated 9 months ago
- Code for the paper "Understanding the Role of Momentum in Stochastic Gradient Methods"☆14Oct 27, 2019Updated 6 years ago
- Chainer implementation of CIFAR-10 dataset training☆12Dec 7, 2022Updated 3 years ago
- Code and results accompanying our paper titled Leveraging Unlabeled Data to Predict Out-of-Distribution Performance at ICLR 2022☆10Dec 8, 2022Updated 3 years ago
- Github Repo for ICML 2022 paper: Communication-Efficient Adaptive Federated Learning☆10Nov 18, 2022Updated 3 years ago
- Experiments from our work Uncertainty Quantification and Deep Ensemble☆10Nov 1, 2021Updated 4 years ago
- Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers"☆26Oct 8, 2023Updated 2 years ago
- Xmixers: A collection of SOTA efficient token/channel mixers☆28Sep 4, 2025Updated 10 months ago
- Single Instruction Multiple Threads GPU Core with textbook Streaming Multi-Processor features☆68Jan 30, 2026Updated 5 months ago
- [Poster; ICLR 2026] [Oral; Neurips OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers☆16Apr 15, 2026Updated 2 months ago