RE-N-Y / sae
☆10Updated 4 months ago
Alternatives and similar repositories for sae:
Users who are interested in sae are comparing it to the libraries listed below.
- Tiny re-implementation of MDM in the style of LLaDA and the nano-gpt speedrun☆44Updated 3 weeks ago
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single-machine microbatches, in PyTorch☆23Updated 2 months ago
- RS-IMLE☆38Updated 3 months ago
- ☆51Updated last year
- Minimal Differentiable Image Reward Functions☆52Updated last month
- ☆33Updated 2 months ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.☆17Updated 2 weeks ago
- ☆32Updated 5 months ago
- Latent Diffusion Language Models☆68Updated last year
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam☆73Updated 8 months ago
- ☆52Updated 5 months ago
- A demo for the Direct Ascent Synthesis: Hidden Generative Capabilities in Discriminative Models paper (https://arxiv.org/abs/2502.07753)☆37Updated 3 weeks ago
- ☆25Updated 9 months ago
- ☆19Updated last week
- ☆21Updated last year
- ☆33Updated 6 months ago
- ☆22Updated 9 months ago
- ☆28Updated 8 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆26Updated 3 weeks ago
- Implementation of https://arxiv.org/pdf/2312.09299☆20Updated 9 months ago
- Focused on fast experimentation and simplicity☆70Updated 3 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore☆26Updated 6 months ago
- ☆10Updated last year
- MEXMA: Token-level objectives improve sentence representations☆40Updated 2 months ago
- A general framework for inference-time scaling and steering of diffusion models with arbitrary rewards.☆115Updated last month
- ☆31Updated 2 months ago
- ☆27Updated 11 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation☆39Updated 5 months ago
- Minimal Implementation of Visual Autoregressive Modelling (VAR)☆29Updated last week
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data.☆18Updated 5 months ago