🧱 Modula software package
☆335 · Aug 18, 2025 · Updated 10 months ago
Alternatives and similar repositories for modula
Users who are interested in modula are comparing it to the libraries listed below.
- Code for the paper "Function-Space Learning Rates" ☆23 · Jun 3, 2025 · Updated last year
- Combining SOAP and MUON ☆22 · Feb 11, 2025 · Updated last year
- Experiment of using Tangent to autodiff triton ☆82 · Jan 22, 2024 · Updated 2 years ago
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆198 · May 30, 2026 · Updated 3 weeks ago
- ☆34 · Oct 4, 2024 · Updated last year
- A single-line modification to any (dualizer-based) optimizer that allows the optimizer to adapt to the scale of the gradients as they cha… ☆19 · Jan 11, 2025 · Updated last year
- [Poster; ICLR 2026] [Oral; Neurips OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers ☆16 · Apr 15, 2026 · Updated 2 months ago
- Efficient optimizers ☆334 · Jun 21, 2026 · Updated last week
- A library for unit scaling in PyTorch ☆134 · Jul 11, 2025 · Updated 11 months ago
- Supporting code for the blog post on modular manifolds. ☆122 · Sep 26, 2025 · Updated 9 months ago
- ☆68 · Apr 8, 2026 · Updated 2 months ago
- supporting pytorch FSDP for optimizers ☆84 · Dec 8, 2024 · Updated last year
- Don't just regulate gradients like in Muon, regulate the weights too ☆32 · Jul 30, 2025 · Updated 10 months ago
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆102 · Jul 24, 2025 · Updated 11 months ago
- Flash-Muon: An Efficient Implementation of Muon Optimizer ☆257 · Jun 15, 2025 · Updated last year
- WIP ☆96 · Aug 13, 2024 · Updated last year
- Maximal Update Parametrization (μP) with Flax & Optax. ☆16 · Dec 27, 2023 · Updated 2 years ago
- ☆63 · Oct 3, 2024 · Updated last year
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only, dont use it for Adam ☆88 · Jul 28, 2024 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆197 · Jan 19, 2026 · Updated 5 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆19 · Jul 24, 2025 · Updated 11 months ago
- Dion optimizer algorithm ☆492 · Updated this week
- Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation ☆14 · Jan 2, 2026 · Updated 5 months ago
- Understand and test language model architectures on synthetic tasks. ☆277 · Mar 22, 2026 · Updated 3 months ago
- Compositional Linear Algebra ☆516 · Aug 1, 2025 · Updated 10 months ago
- ☆271 · Dec 2, 2024 · Updated last year
- Tile primitives for speedy kernels ☆3,497 · Jun 15, 2026 · Updated 2 weeks ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆381 · Nov 15, 2025 · Updated 7 months ago
- ☆45 · Nov 1, 2025 · Updated 7 months ago
- ☆305 · Jul 15, 2024 · Updated last year
- Schedule-Free Optimization in PyTorch ☆2,307 · Jun 18, 2026 · Updated last week
- ☆13 · Mar 10, 2026 · Updated 3 months ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- Official implementation of 'A Large-Scale Exploration of mu-Transfer' ☆32 · Jun 5, 2025 · Updated last year
- 4-bit Shampoo for Memory-Efficient Network Training (NeurIPS 2024) ☆13 · Feb 13, 2025 · Updated last year
- Uncertainty quantification with PyTorch ☆384 · Apr 1, 2026 · Updated 2 months ago
- Accelerated First Order Parallel Associative Scan ☆198 · Jan 7, 2026 · Updated 5 months ago
- Unofficial JAX implementation of the SOAP optimizer (https://arxiv.org/abs/2409.11321) ☆27 · Jan 9, 2026 · Updated 5 months ago
- Minimal but scalable implementation of large language models in JAX ☆34 · Nov 28, 2025 · Updated 7 months ago