kvfrans/matrix-whitening

Code for "What really matters in matrix-whitening optimizers?"

☆25

Alternatives and similar repositories for matrix-whitening

Users who are interested in matrix-whitening are comparing it to the repositories listed below.

GallagherCommaJack / modulax
☆18 · Aug 24, 2024 · Updated last year
pd-perry / TQL
☆28 · May 11, 2026 · Updated 2 months ago
cat-state / modded-nanogpt-moe
☆17 · Sep 6, 2025 · Updated 10 months ago
kvfrans / jaxtransformer
Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc...
☆16 · May 28, 2025 · Updated last year
Jaykef / Triton-nanoGPT
Custom triton kernels for training Karpathy's nanoGPT.
☆19 · Oct 21, 2024 · Updated last year
ethansmith2000 / TransformerExperiments
☆19 · Dec 4, 2025 · Updated 7 months ago
apple / ml-scalefit
☆18 · Mar 3, 2026 · Updated 4 months ago
facebookresearch / scalable-curvature
Code for Dayal Kalra's research internship on scalable curvature measures for neural networks.
☆29 · Feb 3, 2026 · Updated 5 months ago
HumanCompatibleAI / leela-interp
Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
☆31 · Jun 4, 2024 · Updated 2 years ago
drbh / yamoe
🔀 yet another mixture of experts
☆23 · Jun 5, 2026 · Updated last month
fangyuan-ksgk / selective-attention-transformer
Unofficial Implementation of Selective Attention Transformer
☆20 · Oct 31, 2024 · Updated last year
LIONS-EPFL / scion
☆70 · Apr 8, 2026 · Updated 3 months ago
phillipi / plot-net
Tools for visualizing neural nets
☆19 · Jul 29, 2025 · Updated 11 months ago
kvfrans / splus
☆127 · Jun 11, 2025 · Updated last year
nikhilvyas / SOAP
☆273 · Dec 2, 2024 · Updated last year
Dao-AILab / gemm-cublas
☆22 · May 5, 2025 · Updated last year
zyushun / hessian-spectrum
Code for the paper: Why Transformers Need Adam: A Hessian Perspective
☆65 · Mar 11, 2025 · Updated last year
epfml / llm-optimizer-benchmark
Benchmarking Optimizers for LLM Pretraining
☆60 · May 3, 2026 · Updated 2 months ago
Non-Contradiction / convexjlr
Disciplined Convex Programming in R using Convex.jl.
☆14 · Dec 18, 2018 · Updated 7 years ago
fattorib / Flax-ResNets
CIFAR10 ResNets implemented in JAX+Flax
☆12 · Apr 6, 2022 · Updated 4 years ago
cloneofsimo / efae
☆24 · Jun 18, 2024 · Updated 2 years ago
tilde-research / one-layer-deeper
☆29 · Updated this week
rosieyzh / openrlhf-pretrain
Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"
☆29 · Oct 14, 2025 · Updated 9 months ago
NoahAmsel / PolarExpress
☆32 · Jul 6, 2026 · Updated 2 weeks ago
Z-T-WANG / LaProp-Optimizer
Codes accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient"
☆31 · Jul 30, 2020 · Updated 5 years ago
evanatyourservice / psgd_jax
Implementation of PSGD optimizer in JAX
☆36 · Dec 31, 2024 · Updated last year
LCS2-IIITD / DaSLaM
☆17 · Oct 31, 2023 · Updated 2 years ago
kvfrans / lmpo
☆141 · Dec 9, 2025 · Updated 7 months ago
sail-sg / ContinualBench
☆25 · May 20, 2025 · Updated last year
fiveai / understanding_safety_finetuning
Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024)
☆12 · Oct 31, 2024 · Updated last year
gallego-posada / constrained_sparsity
Official implementation for the paper "Controlled Sparsity via Constrained Optimization"
☆12 · Aug 10, 2022 · Updated 3 years ago
fal-ai-community / NativeSparseAttention
research impl of Native Sparse Attention (2502.11089)
☆62 · Feb 19, 2025 · Updated last year
kaloureyes3 / v4-clients
☆10 · Apr 5, 2024 · Updated 2 years ago
cloneofsimo / repa-rf
☆32 · Nov 4, 2024 · Updated last year
evanatyourservice / llm-jax
Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆19 · Jul 24, 2025 · Updated 11 months ago
Overworldai / owl-vaes
Weird autoencoder experiments
☆24 · May 20, 2026 · Updated 2 months ago
jopetty / word-problem
Experiments on the impact of depth in transformers and SSMs.
☆44 · Oct 23, 2025 · Updated 8 months ago
kyunghyuncho / jax-practice
☆13 · Aug 17, 2020 · Updated 5 years ago
radfordneal / plotutils
Plotutils+ is a fork of the GNU plotutils package, with fixes and extensions
☆13 · May 14, 2024 · Updated 2 years ago