cloneofsimo/zeroshampoo

☆33

Alternatives and similar repositories for zeroshampoo

Users who are interested in zeroshampoo are comparing it to the libraries listed below.

fal-ai / lavender-data
Load & manage evolving datasets efficiently
☆22 · Aug 22, 2025 · Updated 10 months ago
ethansmith2000 / fsdp_optimizers
supporting pytorch FSDP for optimizers
☆84 · Dec 8, 2024 · Updated last year
cloneofsimo / min-fsdp
☆93 · Jul 5, 2024 · Updated 2 years ago
fal-ai-community / NativeSparseAttention
research impl of Native Sparse Attention (2502.11089)
☆62 · Feb 19, 2025 · Updated last year
cloneofsimo / ptar
☆13 · Jun 3, 2024 · Updated 2 years ago
cloneofsimo / scaling-guide
WIP
☆96 · Aug 13, 2024 · Updated last year
cloneofsimo / efae
☆24 · Jun 18, 2024 · Updated 2 years ago
cloneofsimo / repa-rf
☆32 · Nov 4, 2024 · Updated last year
junhahyung / MagiCapture
☆11 · Feb 26, 2024 · Updated 2 years ago
timlautk / polargrad
PolarGrad: A Class of Matrix-Gradient Optimizers from a Unifying Preconditioning Perspective
☆18 · Oct 1, 2025 · Updated 9 months ago
cloneofsimo / karras-power-ema-tutorial
☆53 · Jan 6, 2024 · Updated 2 years ago
cloneofsimo / project_RF
☆24 · Jun 4, 2024 · Updated 2 years ago
google-deepmind / asyncdiloco
☆51 · Jan 18, 2024 · Updated 2 years ago
boweiliu / nccl
Optimized primitives for collective multi-GPU communication
☆11 · May 8, 2024 · Updated 2 years ago
swairshah / Intensify
coloring terminal text with intensities (used for plotting probability, entropy with tokens)
☆12 · Oct 11, 2024 · Updated last year
cloneofsimo / ezmup
Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only, don't use it for Adam
☆88 · Jul 28, 2024 · Updated last year
crowsonkb / jax-wavelets
The 2D discrete wavelet transform for JAX
☆45 · Feb 28, 2023 · Updated 3 years ago
SDLAML / disco
☆16 · Dec 11, 2025 · Updated 7 months ago
fal-ai / diffusion-speedrun
Focused on fast experimentation and simplicity
☆77 · Dec 24, 2024 · Updated last year
edwardmilsom / function-space-learning-rates-paper
Code for the paper "Function-Space Learning Rates"
☆23 · Jun 3, 2025 · Updated last year
stepelu / idbm-pytorch
☆13 · Sep 13, 2023 · Updated 2 years ago
smpanaro / apple-silicon-4bit-quant
Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"
☆11 · Mar 31, 2024 · Updated 2 years ago
kyleliang919 / Super_Muon
☆68 · Mar 21, 2025 · Updated last year
HomebrewML / HeavyBall
Efficient optimizers
☆335 · Jul 11, 2026 · Updated last week
cloneofsimo / vqgan-training
Train VAE like a boss
☆313 · Oct 21, 2024 · Updated last year
cloneofsimo / minDinoV2
☆24 · Oct 15, 2024 · Updated last year
angrave / CS341-Lectures-SP24
CS341 for Spring 2024
☆11 · Jul 15, 2024 · Updated 2 years ago
Guitaricet / my_pefty_llama
Minimal implementation of multiple PEFT methods for LLaMA fine-tuning
☆13 · May 7, 2023 · Updated 3 years ago
cloneofsimo / min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆132 · Apr 17, 2024 · Updated 2 years ago
VITA-Group / ChainCoder
[ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, …
☆43 · Nov 9, 2023 · Updated 2 years ago
fal-ai-community / nano-mdm
Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun
☆57 · Mar 10, 2025 · Updated last year
cloneofsimo / minSAE
☆30 · Dec 2, 2024 · Updated last year
srush / tangent
Source-to-Source Debuggable Derivatives in Pure Python
☆15 · Jan 23, 2024 · Updated 2 years ago
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆58 · Aug 20, 2024 · Updated last year
foundation-model-stack / fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…
☆288 · Nov 24, 2025 · Updated 7 months ago
Vchitect / Optix
Memory Efficient Training Framework for Large Video Generation Model
☆25 · Apr 22, 2024 · Updated 2 years ago
argolab / dyna-R
Dyna built on R-exprs (First Prototype)
☆17 · Mar 7, 2022 · Updated 4 years ago
lessw2020 / transformer_central
Various transformers for FSDP research
☆38 · Nov 11, 2022 · Updated 3 years ago
lucidrains / autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
☆46 · May 23, 2023 · Updated 3 years ago