This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent"
☆39Mar 2, 2023Updated 3 years ago
Alternatives and similar repositories for optimizer
Users that are interested in optimizer are comparing it to the libraries listed below.
- ☆18Oct 12, 2022Updated 3 years ago
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025)☆35Sep 28, 2025Updated 9 months ago
- ☆18Jan 17, 2024Updated 2 years ago
- The Happy Faces Benchmark☆15Jul 20, 2023Updated 2 years ago
- We define and estimate smooth unique information of samples with respect to classifier weights and predictions. We compute these quantiti…☆11Mar 9, 2021Updated 5 years ago
- Code for NIPS 2015 "Gradient-Free Hamiltonian Monte Carlo via Efficient Kernel Exponential Families"☆26Jun 7, 2018Updated 8 years ago
- Code to support the guide to logical induction for software engineers☆11Mar 24, 2025Updated last year
- Effective Attention Sheds Light On Interpretability - Findings of ACL2021☆11May 16, 2021Updated 5 years ago
- Custom triton kernels for training Karpathy's nanoGPT.☆19Oct 21, 2024Updated last year
- Official code for Deep Bayesian Video Frame Interpolation (ECCV2022)☆18May 29, 2023Updated 3 years ago
- Preparing for ML Interviews.☆53Jan 12, 2026Updated 5 months ago
- Scalable Computation of Hessian Diagonals☆14Jun 2, 2024Updated 2 years ago
- [NeurIPS2023] "Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning" by Yihua Zhang*, Yimeng Zhang*,…☆14Oct 12, 2023Updated 2 years ago
- Update: Ignore this repo, check out @lucidrains' implementation https://github.com/lucidrains/musiclm-pytorch☆15Jan 27, 2023Updated 3 years ago
- Code of the paper: Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function☆13Apr 13, 2026Updated 2 months ago
- ☆35Sep 23, 2022Updated 3 years ago
- Code for Generalization Guarantees for (Multi-Modal) Imitation Learning☆11Jul 14, 2022Updated 3 years ago
- A simple shell script for splitting the PDF of a paper into the main body and an appendix.☆18Jun 1, 2020Updated 6 years ago
- Official implementation of Adaptive Feature Transfer (AFT)☆24Jun 12, 2024Updated 2 years ago
- [NeurIPS 2021] code for "Taxonomizing local versus global structure in neural network loss landscapes" https://arxiv.org/abs/2107.11228☆20Jan 7, 2022Updated 4 years ago
- Codebase for the paper HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View☆13Jun 5, 2024Updated 2 years ago
- Effect of tokenization on transformers for biological sequence☆23Dec 31, 2025Updated 6 months ago
- Repo for our work "Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence"☆21Jun 2, 2025Updated last year
- Code for replicating experiments from the paper, Preference Exploration for Efficient Bayesian Optimization with Multiple Outcomes, publi…☆14Jun 22, 2023Updated 3 years ago
- SqueezeNet in Tensorflow☆10Jun 7, 2017Updated 9 years ago
- T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation (ICCV'25)☆55Oct 6, 2025Updated 8 months ago
- ☆39Oct 21, 2022Updated 3 years ago
- ☆12Aug 17, 2022Updated 3 years ago
- (ICLR 2026) Optimas: Optimizing Compound AI Systems☆80Feb 6, 2026Updated 4 months ago
- Training vision models with full-batch gradient descent and regularization☆40Feb 14, 2023Updated 3 years ago
- This is the official implementation of the ICML 2023 paper - Can Forward Gradient Match Backpropagation?☆13May 31, 2023Updated 3 years ago
- Experiments with AllenNLP on semantic parsing datasets☆17Dec 29, 2018Updated 7 years ago
- Towards Unified and Effective Domain Generalization☆34Nov 27, 2023Updated 2 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…☆53Oct 22, 2023Updated 2 years ago
- Cheat sheet for interacting with the SLURM scheduler☆17Jun 1, 2017Updated 9 years ago
- Decision Transformer JAX - Reproduction of 'Decision Transformer: Reinforcement Learning via Sequence Modeling' in JAX and Haiku☆13Aug 14, 2024Updated last year
- Source code to accompany research paper on training multi token prediction language models using self-distillation.☆39Feb 21, 2026Updated 4 months ago
- Single-file SAC-N implementation on jax with flax and equinox. 10x faster than pytorch☆56May 21, 2023Updated 3 years ago
- Normalized Wasserstein for Mixture Distributions☆11Mar 24, 2023Updated 3 years ago