apple / ml-act
☆46Updated 7 months ago
Alternatives and similar repositories for ml-act
Users that are interested in ml-act are comparing it to the libraries listed below
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun☆52Updated 3 months ago
- ☆34Updated 9 months ago
- ☆79Updated 10 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation☆40Updated 8 months ago
- Synthetic Alphabet Dataset☆19Updated 2 months ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs☆36Updated 2 years ago
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam☆80Updated 10 months ago
- ☆17Updated 7 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n…☆43Updated 7 months ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"☆29Updated 7 months ago
- ☆51Updated last year
- Sparse Autoencoders for Stable Diffusion XL models.☆65Updated this week
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]☆66Updated 9 months ago
- ☆61Updated 7 months ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…☆54Updated last year
- A general framework for inference-time scaling and steering of diffusion models with arbitrary rewards.☆155Updated last week
- Official repo for the paper "Weight-based Decomposition: A Case for Bilinear MLPs"☆21Updated 6 months ago
- ☆53Updated 8 months ago
- Latest Weight Averaging (NeurIPS HITY 2022)☆30Updated 2 years ago
- This repo is based on https://github.com/jiaweizzhao/GaLore☆28Updated 9 months ago
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers☆69Updated this week
- ☆32Updated 5 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent☆82Updated this week
- ☆52Updated last year
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch☆25Updated 5 months ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025]☆19Updated last month
- Remasking Discrete Diffusion Models with Inference-Time Scaling☆26Updated 3 months ago
- PyTorch library for Active Fine-Tuning☆80Updated 4 months ago
- ☆81Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆134Updated this week