cloneofsimo/scaling-guide

cloneofsimo / scaling-guide

WIP

☆96

Alternatives and similar repositories for scaling-guide

Users who are interested in scaling-guide are comparing it to the libraries listed below.

cloneofsimo / ptar
☆13 · Jun 3, 2024 · Updated 2 years ago
cloneofsimo / efae
☆24 · Jun 18, 2024 · Updated 2 years ago
graphcore-research / unit-scaling
A library for unit scaling in PyTorch
☆134 · Jul 11, 2025 · Updated last year
cloneofsimo / minDinoV2
☆24 · Oct 15, 2024 · Updated last year
cloneofsimo / zeroshampoo
☆33 · Sep 10, 2024 · Updated last year
fal-ai / diffusion-speedrun
Focused on fast experimentation and simplicity
☆77 · Dec 24, 2024 · Updated last year
cloneofsimo / repa-rf
☆32 · Nov 4, 2024 · Updated last year
edwardmilsom / function-space-learning-rates-paper
Code for the paper "Function-Space Learning Rates"
☆23 · Jun 3, 2025 · Updated last year
cloneofsimo / vqgan-training
Train VAE like a boss
☆313 · Oct 21, 2024 · Updated last year
cloneofsimo / minRF
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
☆640 · Jul 1, 2024 · Updated 2 years ago
GunwooHan / E-LatentLPIPS
Unofficial Implementation of E-LatentLPIPS (Ensembled-LatentLPIPS) of Diffusion2GAN
☆42 · Jul 11, 2024 · Updated 2 years ago
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆132 · Apr 17, 2024 · Updated 2 years ago
cloneofsimo / min-fsdp
☆93 · Jul 5, 2024 · Updated 2 years ago
CompVis / DisCLIP
[AAAI 2025] Does VLM Classification Benefit from LLM Description Semantics?
☆26 · Aug 5, 2025 · Updated 11 months ago
ethansmith2000 / fsdp_optimizers
Supporting PyTorch FSDP for optimizers
☆84 · Dec 8, 2024 · Updated last year
nikhilvyas / SOAP
☆273 · Dec 2, 2024 · Updated last year
microsoft / mup
maximal update parametrization (µP)
☆1,739 · Jul 17, 2024 · Updated 2 years ago
cloneofsimo / ezmup
Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam
☆88 · Jul 28, 2024 · Updated last year
cloneofsimo / minSAE
☆30 · Dec 2, 2024 · Updated last year
cloneofsimo / min-max-in-dit
☆27 · May 3, 2024 · Updated 2 years ago
ethansmith2000 / TransformerExperiments
☆19 · Dec 4, 2025 · Updated 7 months ago
cloneofsimo / insightful-nn-papers
These papers will provide unique insightful concepts that will broaden your perspective on neural networks and deep learning
☆48 · Sep 3, 2023 · Updated 2 years ago
HomebrewML / HeavyBall
Efficient optimizers
☆334 · Jul 11, 2026 · Updated last week
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆199 · Jan 19, 2026 · Updated 6 months ago
modula-systems / modula
🧱 Modula software package
☆337 · Aug 18, 2025 · Updated 11 months ago
BlinkDL / LinearAttentionArena
Here we will test various linear attention designs.
☆62 · Apr 25, 2024 · Updated 2 years ago
OurBluePrint / easy_video
☆20 · Mar 3, 2025 · Updated last year
samblouir / birdie
☆15 · Jun 8, 2026 · Updated last month
sayakpaul / tt-scale-flux
Inference-time scaling of diffusion-based image and video generation models.
☆174 · Dec 17, 2025 · Updated 7 months ago
mmathew23 / improved_edm
Implementation of "Analyzing and Improving the Training Dynamics of Diffusion Models"
☆96 · Feb 12, 2024 · Updated 2 years ago
radarFudan / mamba-minimal-jax
☆36 · Nov 22, 2024 · Updated last year
fal-ai-community / NativeSparseAttention
research impl of Native Sparse Attention (2502.11089)
☆62 · Feb 19, 2025 · Updated last year
crowsonkb / jax-wavelets
The 2D discrete wavelet transform for JAX
☆45 · Feb 28, 2023 · Updated 3 years ago
xjdr-alt / mla_blog_translation
☆13 · Jun 18, 2024 · Updated 2 years ago
google-deepmind / nanodo
☆304 · Jul 15, 2024 · Updated 2 years ago
dahyun-kang / cub-200-2011-part-visualizer
Visualization tool for CUB-200-2011 part keypoints (Wah et al.).
☆10 · Sep 17, 2021 · Updated 4 years ago
CompVis / RepTok
[ICLR 2026] Adapting Self-Supervised Representations as a Latent Space for Efficient Generation
☆59 · Apr 24, 2026 · Updated 2 months ago
fal-ai-community / minDDPD
☆33 · Jan 6, 2025 · Updated last year
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆46 · Jul 20, 2024 · Updated 2 years ago