riverstone496 / awesome-second-order-optimization
☆25 Updated last year
Alternatives and similar repositories for awesome-second-order-optimization:
Users who are interested in awesome-second-order-optimization are comparing it to the libraries listed below:
- supporting pytorch FSDP for optimizers ☆75 Updated last month
- ☆146 Updated last month
- 🧱 Modula software package ☆132 Updated this week
- Implementation of PSGD optimizer in JAX ☆26 Updated 2 weeks ago
- ☆48 Updated 11 months ago
- ☆75 Updated 6 months ago
- ☆53 Updated 11 months ago
- WIP ☆92 Updated 5 months ago
- ☆50 Updated 3 months ago
- ☆31 Updated 9 months ago
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆21 Updated 2 weeks ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 Updated last month
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆119 Updated this week
- Experiment of using Tangent to autodiff triton ☆74 Updated 11 months ago
- A basic pure pytorch implementation of flash attention ☆16 Updated 2 months ago
- Flow-matching algorithms in JAX ☆82 Updated 5 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆66 Updated 2 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆58 Updated 3 months ago
- Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule ☆73 Updated 2 weeks ago
- ☆37 Updated 9 months ago
- ☆49 Updated 7 months ago
- Normalized Transformer (nGPT) ☆145 Updated last month
- ☆33 Updated 4 months ago
- Stick-breaking attention ☆41 Updated this week
- 94% on CIFAR-10 in 2.6 seconds 💨 96% in 27 seconds ☆195 Updated last month
- ☆51 Updated 7 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆121 Updated 9 months ago
- Code for https://arxiv.org/abs/2406.04329 ☆51 Updated last month
- nanoGPT-like codebase for LLM training ☆83 Updated this week
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆30 Updated last month