cloneofsimo/minSAE

cloneofsimo / minSAE

☆30

Alternatives and similar repositories for minSAE

Users that are interested in minSAE are comparing it to the libraries listed below.

ethansmith2000 / TransformerExperiments
View on GitHub
☆19 · Dec 4, 2025 · Updated 7 months ago
cloneofsimo / ezmup
View on GitHub
Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam.
☆88 · Jul 28, 2024 · Updated last year
cloneofsimo / repa-rf
View on GitHub
☆32 · Nov 4, 2024 · Updated last year
cloneofsimo / project_RF
View on GitHub
☆24 · Jun 4, 2024 · Updated 2 years ago
koayon / phil-interp-papers
View on GitHub
A curated reading list for researchers in the Philosophy of Interpretability
☆17 · Aug 17, 2025 · Updated 11 months ago
cloneofsimo / efae
View on GitHub
☆24 · Jun 18, 2024 · Updated 2 years ago
science-of-finetuning / sparsity-artifacts-crosscoders
View on GitHub
Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper.
☆17 · Jul 6, 2026 · Updated 2 weeks ago
ckkissane / sae-transfer
View on GitHub
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13 · Jul 18, 2024 · Updated 2 years ago
YuchenJin / llm.c
View on GitHub
LLM training in simple, raw C/CUDA
☆15 · Dec 5, 2024 · Updated last year
science-of-finetuning / crosscoder_learning
View on GitHub
Modified to support crosscoder training.
☆27 · Jul 2, 2026 · Updated 2 weeks ago
Gengzigang / TokenSet
View on GitHub
Official PyTorch implementation of TokenSet.
☆129 · Mar 21, 2025 · Updated last year
Laz4rz / mup
View on GitHub
Minimal (truly) muP implementation, consistent with the notation of the TP4 and TP5 papers
☆14 · Jan 2, 2026 · Updated 6 months ago
joao-siilva / studies
View on GitHub
A daily log of my entire study journey
☆15 · Jan 30, 2025 · Updated last year
HazyResearch / train-tk
View on GitHub
train with kittens!
☆66 · Oct 25, 2024 · Updated last year
cloneofsimo / min-fsdp
View on GitHub
☆93 · Jul 5, 2024 · Updated 2 years ago
Snektron / gpumode-amd-fp8-mm
View on GitHub
My submission for the GPUMODE/AMD fp8 mm challenge
☆29 · Jun 4, 2025 · Updated last year
cloneofsimo / scaling-guide
View on GitHub
WIP
☆96 · Aug 13, 2024 · Updated last year
evanatyourservice / psgd_jax
View on GitHub
Implementation of PSGD optimizer in JAX
☆36 · Dec 31, 2024 · Updated last year
andyljones / boardlaw
View on GitHub
Scaling scaling laws with board games.
☆53 · Jul 17, 2023 · Updated 3 years ago
cloneofsimo / insightful-nn-papers
View on GitHub
These papers will provide unique insightful concepts that will broaden your perspective on neural networks and deep learning
☆48 · Sep 3, 2023 · Updated 2 years ago
edeyneka / pdf-reader-extension
View on GitHub
☆13 · Mar 9, 2025 · Updated last year
cloneofsimo / karras-power-ema-tutorial
View on GitHub
☆53 · Jan 6, 2024 · Updated 2 years ago
curt-tigges / probity
View on GitHub
☆19 · Apr 10, 2025 · Updated last year
cloneofsimo / zeroshampoo
View on GitHub
☆33 · Sep 10, 2024 · Updated last year
evanatyourservice / llm-jax
View on GitHub
Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆19 · Jul 24, 2025 · Updated 11 months ago
dennishein / cpfgmpp_PCCT_denoising
View on GitHub
☆12 · Apr 2, 2024 · Updated 2 years ago
JesseFarebro / flax-mup
View on GitHub
Maximal Update Parametrization (μP) with Flax & Optax.
☆16 · Dec 27, 2023 · Updated 2 years ago
numfocus / project-fundraising
View on GitHub
List of new Project Fundraising Opportunities for NumFOCUS Sponsored Projects
☆14 · May 14, 2026 · Updated 2 months ago
cloneofsimo / minRF
View on GitHub
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
☆640 · Jul 1, 2024 · Updated 2 years ago
goodfire-ai / sdxl-turbo-interpretability
View on GitHub
☆49 · May 27, 2025 · Updated last year
cloneofsimo / min-max-gpt
View on GitHub
Minimal (400 LOC) implementation of maximal (multi-node, FSDP) GPT training
☆132 · Apr 17, 2024 · Updated 2 years ago
ToyotaResearchInstitute / gradient-estimation-sampler
View on GitHub
Code for the paper "Interpreting and Improving Diffusion Models from an Optimization Perspective", appearing in ICML 2024
☆15 · Sep 30, 2024 · Updated last year
NMS05 / DinoV2-BERT-CLIP
View on GitHub
A simple PyTorch implementation of CLIP model using DinoV2 and BERT
☆16 · Sep 26, 2023 · Updated 2 years ago
astro-informatics / QuantifAI
View on GitHub
PyTorch-based radio-interferometric imaging reconstruction package with scalable Bayesian uncertainty quantification relying on data-driv…
☆12 · Feb 17, 2025 · Updated last year
apple / ml-ademamix
View on GitHub
☆71 · Nov 15, 2024 · Updated last year
luongthecong123 / fp8-quant-matmul
View on GitHub
Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.
☆19 · Feb 9, 2026 · Updated 5 months ago
fal-ai-community / NativeSparseAttention
View on GitHub
Research implementation of Native Sparse Attention (2502.11089)
☆62 · Feb 19, 2025 · Updated last year
tilde-research / activault
View on GitHub
Engine for collecting, uploading, and downloading model activations
☆30 · Apr 2, 2025 · Updated last year
uzaymacar / blackjack-with-gui
View on GitHub
A Blackjack game with GUI written in Java.
☆11 · Nov 21, 2018 · Updated 7 years ago