A truly minimal muP implementation, consistent with the notation of the TP4 and TP5 papers
☆14 · Jan 2, 2026 · Updated 4 months ago
Alternatives and similar repositories for mup
Users that are interested in mup are comparing it to the libraries listed below.
- ☆24 · Jun 4, 2024 · Updated last year
- Code for the paper "Data Attribution for Text-to-Image Models by Unlearning Synthesized Images." ☆17 · May 23, 2025 · Updated 11 months ago
- Code for the paper "Function-Space Learning Rates" ☆24 · Jun 3, 2025 · Updated 11 months ago
- ☆27 · May 3, 2024 · Updated 2 years ago
- NanoGPT (124M) in 5 minutes ☆15 · Feb 14, 2025 · Updated last year
- Website for CSE 234, Winter 2025 ☆15 · Mar 24, 2025 · Updated last year
- Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs" ☆24 · Apr 30, 2025 · Updated last year
- Code and data for paper "(How) do Language Models Track State?" ☆22 · Mar 31, 2025 · Updated last year
- ☆37 · Dec 12, 2025 · Updated 4 months ago
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; do not use it with Adam ☆87 · Jul 28, 2024 · Updated last year
- ☆52 · Mar 14, 2025 · Updated last year
- Official implementation of: "Online Marker-free Extrinsic Camera Calibration using Person Keypoint Detections" by Pätzold, Bultmann & Beh… ☆23 · Feb 1, 2024 · Updated 2 years ago
- run tinygrad kernels on esp32 ☆14 · Nov 28, 2023 · Updated 2 years ago
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) ☆39 · Jan 23, 2024 · Updated 2 years ago
- Maximal Update Parametrization (μP) with Flax & Optax. ☆16 · Dec 27, 2023 · Updated 2 years ago
- Official implementation of Categorical Flow Maps on text. ☆56 · Feb 16, 2026 · Updated 2 months ago
- A flat container abstraction for Rust ☆16 · Nov 24, 2025 · Updated 5 months ago
- [ICML 2025] No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces (official repository) ☆44 · Aug 7, 2025 · Updated 8 months ago
- Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023) ☆15 · Jul 25, 2023 · Updated 2 years ago
- ☆13 · Nov 5, 2024 · Updated last year
- Implementation of "Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions" ☆24 · Aug 27, 2024 · Updated last year
- Code for Tangent Model Composition for Ensembling and Continual Fine-tuning (ICCV 2023) and Tangent Transformers for Composition, Privacy… ☆14 · May 14, 2024 · Updated last year
- ☆41 · Jan 12, 2026 · Updated 3 months ago
- A Declarative Language for Expressing Partial World Knowledge to Reinforcement Learning Agents ☆17 · Jan 19, 2024 · Updated 2 years ago
- Cross-connect stdin and stdout of 2 processes and show outputs from each. (No longer maintained) ☆16 · Nov 18, 2020 · Updated 5 years ago
- Help protect against malicious build scripts ☆27 · Apr 26, 2026 · Updated last week
- TABR-BERT: an Accurate and Robust BERT-based Transfer Learning Model for TCR-pMHC Interaction Prediction ☆12 · Jul 19, 2024 · Updated last year
- ☆12 · Jul 30, 2025 · Updated 9 months ago
- Automatic identification of regions in the latent space of a model that correspond to unique concepts, namely to concepts with a semantic… ☆14 · Nov 22, 2023 · Updated 2 years ago
- Calculate allowed interactions in QED ☆10 · Nov 2, 2022 · Updated 3 years ago
- Concept Learning Dynamics ☆16 · Oct 29, 2024 · Updated last year
- Web-app meant for qp.metakgp.org ☆21 · Dec 8, 2022 · Updated 3 years ago
- ☆308 · Jul 15, 2024 · Updated last year
- ☆14 · May 15, 2024 · Updated last year
- Higher Order SVD implementation in PyTorch ☆13 · Nov 14, 2022 · Updated 3 years ago
- ☆30 · Dec 2, 2024 · Updated last year
- A simple Dataset generator for Moving Mnist ☆14 · May 26, 2023 · Updated 2 years ago
- Experimental GPU language with meta-programming ☆31 · Sep 6, 2024 · Updated last year
- Package to rebalance and harvest tax losses in an ETF portfolio ☆28 · Oct 22, 2024 · Updated last year