ethansmith2000 / fsdp_optimizers
supporting pytorch FSDP for optimizers
☆35 Updated this week
Related projects
Alternatives and complementary repositories for fsdp_optimizers
- ☆18 Updated last month
- ☆73 Updated 4 months ago
- Efficient optimizers ☆87 Updated this week
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆16 Updated this week
- Utilities for PyTorch distributed ☆23 Updated last year
- ☆19 Updated 2 weeks ago
- ☆20 Updated last year
- ☆49 Updated 8 months ago
- Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers ☆59 Updated 4 months ago
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆29 Updated 3 weeks ago
- ☆31 Updated 2 months ago
- PyTorch interface for TrueGrad Optimizers ☆39 Updated last year
- Train vision models using JAX and 🤗 transformers ☆95 Updated last month
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆46 Updated 10 months ago
- Automatically take good care of your preemptible TPUs ☆32 Updated last year
- Collection of autoregressive model implementation ☆67 Updated this week
- ☆77 Updated 5 months ago
- Latent Diffusion Language Models ☆67 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆84 Updated this week
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆113 Updated 7 months ago
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam. ☆68 Updated 3 months ago
- ☆53 Updated 10 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆85 Updated 2 months ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆14 Updated last year
- An implementation of the Llama architecture, to instruct and delight ☆21 Updated 3 months ago
- ☆129 Updated last week
- ☆77 Updated 7 months ago
- ☆13 Updated 4 months ago
- WIP ☆89 Updated 3 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆44 Updated 2 weeks ago