A (truly) minimal muP implementation, consistent with the notation of the TP4 and TP5 papers
☆14Jan 2, 2026Updated 2 months ago
Alternatives and similar repositories for mup
Users that are interested in mup are comparing it to the libraries listed below.
- ☆24Jun 4, 2024Updated last year
- Code for the paper "Data Attribution for Text-to-Image Models by Unlearning Synthesized Images."☆17May 23, 2025Updated 10 months ago
- Code for the paper "Function-Space Learning Rates"☆25Jun 3, 2025Updated 9 months ago
- ☆27May 3, 2024Updated last year
- NanoGPT (124M) in 5 minutes☆15Feb 14, 2025Updated last year
- Website for CSE 234, Winter 2025☆13Mar 24, 2025Updated last year
- Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"☆23Apr 30, 2025Updated 10 months ago
- Code and data for paper "(How) do Language Models Track State?"☆22Mar 31, 2025Updated 11 months ago
- ☆37Dec 12, 2025Updated 3 months ago
- Simple implementation of muP, based on the Spectral Condition for Feature Learning. The implementation is SGD-only; don't use it with Adam.☆86Jul 28, 2024Updated last year
- ☆50Mar 14, 2025Updated last year
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024)☆38Jan 23, 2024Updated 2 years ago
- Official implementation of: "Online Marker-free Extrinsic Camera Calibration using Person Keypoint Detections" by Pätzold, Bultmann & Beh…☆23Feb 1, 2024Updated 2 years ago
- run tinygrad kernels on esp32☆13Nov 28, 2023Updated 2 years ago
- Maximal Update Parametrization (μP) with Flax & Optax.☆16Dec 27, 2023Updated 2 years ago
- Official implementation of Categorical Flow Maps on text.☆48Feb 16, 2026Updated last month
- [ICML 2025] No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces (official repository)☆40Aug 7, 2025Updated 7 months ago
- A flat container abstraction for Rust☆16Nov 24, 2025Updated 4 months ago
- Official code for the paper "Compositional Generalization from First Principles" (NeurIPS 2023)☆14Jul 25, 2023Updated 2 years ago
- ☆13Nov 5, 2024Updated last year
- Implementation of "Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions"☆24Aug 27, 2024Updated last year
- Code for Tangent Model Composition for Ensembling and Continual Fine-tuning (ICCV 2023) and Tangent Transformers for Composition, Privacy…☆13May 14, 2024Updated last year
- ☆40Jan 12, 2026Updated 2 months ago
- A Declarative Language for Expressing Partial World Knowledge to Reinforcement Learning Agents☆17Jan 19, 2024Updated 2 years ago
- Cross-connect stdin and stdout of 2 processes and show outputs from each. (No longer maintained)☆16Nov 18, 2020Updated 5 years ago
- TABR-BERT: an Accurate and Robust BERT-based Transfer Learning Model for TCR-pMHC Interaction Prediction☆12Jul 19, 2024Updated last year
- Help protect against malicious build scripts☆27Mar 14, 2026Updated last week
- ☆12Jul 30, 2025Updated 7 months ago
- (TPAMI 2026) Learning Continuous Wasserstein Barycenter Space for Generalized All-in-One Image Restoration☆71Mar 6, 2026Updated 3 weeks ago
- Automatic identification of regions in the latent space of a model that correspond to unique concepts, namely to concepts with a semantic…☆14Nov 22, 2023Updated 2 years ago
- Calculate allowed interactions in QED☆10Nov 2, 2022Updated 3 years ago
- Experimental GPU language with meta-programming☆27Sep 6, 2024Updated last year
- Concept Learning Dynamics☆16Oct 29, 2024Updated last year
- Web-app meant for qp.metakgp.org☆21Dec 8, 2022Updated 3 years ago
- ☆306Jul 15, 2024Updated last year
- ☆14May 15, 2024Updated last year
- Higher Order SVD implementation in PyTorch☆13Nov 14, 2022Updated 3 years ago
- ☆30Dec 2, 2024Updated last year
- A simple Dataset generator for Moving Mnist☆14May 26, 2023Updated 2 years ago