Maximal Update Parametrization (μP) with Flax & Optax.
☆16 · Dec 27, 2023 · Updated 2 years ago
Alternatives and similar repositories for flax-mup
Users that are interested in flax-mup are comparing it to the libraries listed below.
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆19 · Jul 24, 2025 · Updated 10 months ago
- Official implementation of 'A Large-Scale Exploration of mu-Transfer' ☆32 · Jun 5, 2025 · Updated last year
- Implementation of PSGD optimizer in JAX ☆35 · Dec 31, 2024 · Updated last year
- ☆24 · Jun 18, 2024 · Updated 2 years ago
- ☆18 · Aug 24, 2024 · Updated last year
- (EasyDel Former) is a utility library designed to simplify and enhance development in JAX ☆32 · Jun 11, 2026 · Updated last week
- Minimal but scalable implementation of large language models in JAX ☆34 · Nov 28, 2025 · Updated 6 months ago
- A minimal command-line utility written in Rust for querying GPU status ☆24 · Dec 21, 2025 · Updated 5 months ago
- Automatically take good care of your preemptible TPUs ☆37 · May 15, 2023 · Updated 3 years ago
- ☆14 · Jul 26, 2023 · Updated 2 years ago
- Machine Learning eXperiment Utilities ☆48 · Jul 29, 2025 · Updated 10 months ago
- Train vision models using JAX and 🤗 transformers ☆103 · Dec 14, 2025 · Updated 6 months ago
- Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models" ☆25 · Dec 12, 2023 · Updated 2 years ago
- Implementation of https://arxiv.org/pdf/2312.09299