JesseFarebro / flax-mup
Maximal Update Parametrization (μP) with Flax & Optax.
☆16Dec 27, 2023Updated 2 years ago
Alternatives and similar repositories for flax-mup
Users who are interested in flax-mup are comparing it to the libraries listed below
- (EasyDel Former) A utility library designed to simplify and enhance development in JAX☆29Feb 2, 2026Updated last week
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.☆32Jun 5, 2025Updated 8 months ago
- Implementation of PSGD optimizer in JAX☆35Dec 31, 2024Updated last year
- Minimal but scalable implementation of large language models in JAX☆35Nov 28, 2025Updated 2 months ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.☆18Jul 24, 2025Updated 6 months ago
- ☆18Aug 24, 2024Updated last year
- ☆23Jun 18, 2024Updated last year
- Train vision models using JAX and 🤗 transformers☆100Dec 14, 2025Updated 2 months ago
- Your favourite classical machine learning algos on the GPU/TPU☆21Dec 14, 2025Updated 2 months ago
- A simple, performant and scalable JAX-based world modeling codebase.☆135Jan 15, 2026Updated last month
- Stainless neural networks in JAX☆34Feb 3, 2026Updated last week
- A comprehensive JAX/NNX library for diffusion and flow matching generative algorithms, featuring DiT (Diffusion Transformer) and its vari…☆131Oct 16, 2025Updated 4 months ago
- JAX implementation of the Mistral 7b v0.2 model☆35Jul 3, 2024Updated last year
- Automatically take good care of your preemptible TPUs☆37May 15, 2023Updated 2 years ago
- The official starter-kit for NeurIPS 2025 mind games competition☆21Jul 27, 2025Updated 6 months ago
- JAX reimplementation of the DeepMind paper "Genie: Generative Interactive Environments"☆100Jan 23, 2025Updated last year
- Using JAX to generate piano music as MIDI☆39Nov 28, 2023Updated 2 years ago
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam☆86Jul 28, 2024Updated last year
- ☆91Sep 19, 2022Updated 3 years ago
- Incognito Proxy chrome extension☆10Sep 27, 2023Updated 2 years ago
- Partially Observable Multi-Agent RL with Transformers☆17Updated this week
- ☆10Oct 9, 2025Updated 4 months ago
- Residual Quantization Autoencoder, used for interpreting LLMs☆13Jan 1, 2025Updated last year
- Fine-tuning GPT-2 to generate research paper abstracts☆12Apr 28, 2021Updated 4 years ago
- A fine-mapping method integrating GWAS summary statistics and functional annotation data☆11Dec 28, 2023Updated 2 years ago
- The dataset was recorded on the Husky robotics platform on the university campus and consists of 5 tracks recorded at different tim…☆11Mar 25, 2025Updated 10 months ago
- ☆15Jul 27, 2023Updated 2 years ago
- Machine Learning eXperiment Utilities☆48Jul 29, 2025Updated 6 months ago
- ☆13Jun 22, 2025Updated 7 months ago
- Neural multiclass ab initio reconstruction for cryo-EM.☆13Dec 5, 2024Updated last year
- FAQ for University of California Santa Cruz 2019 Incoming Grads☆11Apr 4, 2019Updated 6 years ago
- Simulating the fractional quantum Hall effect with neural network variational Monte Carlo☆20Sep 12, 2025Updated 5 months ago
- Compression primitives for uplink compression in Federated Learning that are compatible with Secure Aggregation.☆10Jul 27, 2022Updated 3 years ago
- JAX Scalify: end-to-end scaled arithmetics☆18Oct 30, 2024Updated last year
- ☆20Dec 1, 2025Updated 2 months ago
- Code to reproduce experiments in Markovian Flow Matching: Accelerating MCMC with Continuous Normalizing Flows☆13May 23, 2024Updated last year
- 4-bit Shampoo for Memory-Efficient Network Training (NeurIPS 2024)☆13Feb 13, 2025Updated last year
- A CLI tool that downloads the materials of a course hosted on the MET (GUC) website, and organizes the materials into their respective fo…☆10Mar 25, 2022Updated 3 years ago
- ☆12Oct 21, 2023Updated 2 years ago