This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent"
☆39 · Mar 2, 2023 · Updated 2 years ago
Alternatives and similar repositories for optimizer
Users who are interested in optimizer compare it to the repositories listed below.
- We define and estimate smooth unique information of samples with respect to classifier weights and predictions. We compute these quantiti… ☆11 · Mar 9, 2021 · Updated 4 years ago
- Effect of tokenization on transformers for biological sequence ☆22 · Dec 31, 2025 · Updated 2 months ago
- Custom triton kernels for training Karpathy's nanoGPT. ☆19 · Oct 21, 2024 · Updated last year
- Official code for Deep Bayesian Video Frame Interpolation (ECCV2022) ☆18 · May 29, 2023 · Updated 2 years ago
- [ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference" ☆21 · Feb 16, 2025 · Updated last year
- Official implementation of Adaptive Feature Transfer (AFT) ☆23 · Jun 12, 2024 · Updated last year
- Repository for the code assignment of the Deep Learning 1 course, Fall 2022 edition ☆20 · Dec 9, 2022 · Updated 3 years ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training ☆23 · Aug 18, 2024 · Updated last year
- Code for the PAPA paper ☆27 · Nov 8, 2022 · Updated 3 years ago
- An Interpretable Self-Attention Network with block-attention and attention-attribution. ☆12 · Sep 22, 2023 · Updated 2 years ago
- Code for the paper Self-Supervised Learning of Split Invariant Equivariant Representations ☆31 · Sep 4, 2023 · Updated 2 years ago
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025) ☆33 · Sep 28, 2025 · Updated 5 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆32 · Aug 5, 2025 · Updated 6 months ago
- An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024) ☆33 · Jun 19, 2024 · Updated last year
- (ICLR 2026) Optimas: Optimizing Compound AI Systems ☆76 · Feb 6, 2026 · Updated 3 weeks ago
- Training vision models with full-batch gradient descent and regularization ☆39 · Feb 14, 2023 · Updated 3 years ago
- Preparing for ML Interviews. ☆54 · Jan 12, 2026 · Updated last month
- rnalib: a python-based transcriptomics library ☆11 · Jan 23, 2026 · Updated last month
- English-Chinese-Japanese translation dataset of the terms in Genshin Impact ☆39 · Updated this week
- A python algorithm to change the pitch of the voice in real time ☆13 · Dec 13, 2020 · Updated 5 years ago
- R Package for Bootstrap Unit Root Tests ☆10 · May 5, 2025 · Updated 9 months ago
- A fast and accurate RNA secondary structure, end-to-end approach prediction method. ☆12 · Jan 2, 2025 · Updated last year
- This repo contains the code to reproduce figures in my dissertation "Passive Imaging and Characterization of the Subsurface With Distribu… ☆10 · Jun 14, 2018 · Updated 7 years ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 · Jul 19, 2024 · Updated last year
- Jupyter notebooks for analysis and figures related to the native organelle IP paper ☆13 · Nov 13, 2025 · Updated 3 months ago
- RLMD: A Dataset for Road Line and Marking Segmentation (ICCE-TW 2023) ☆14 · Aug 29, 2024 · Updated last year
- [ACM MM 2024 (Oral)] Official PyTorch Implementation of Paper "MovingColor: Seamless Fusion of Fine-grained Video Color Enhancement" ☆11 · Dec 30, 2024 · Updated last year
- [Advanced Photonics Research, 2021] Control tightly focused fields via manipulating pupil functions ☆10 · Dec 25, 2024 · Updated last year
- Spatial Seemingly Unrelated Regressions ☆11 · Apr 22, 2022 · Updated 3 years ago
- The sparse Bayesian learning sandbox ☆11 · Jul 4, 2021 · Updated 4 years ago