evanatyourservice / kron_torch
An implementation of PSGD Kron second-order optimizer for PyTorch
☆16 Updated this week
Related projects
Alternatives and complementary repositories for kron_torch
- ☆18 Updated last month
- ☆31 Updated 2 months ago
- Efficient optimizers ☆79 Updated this week
- ☆26 Updated 6 months ago
- ☆128 Updated this week
- The 2D discrete wavelet transform for JAX ☆38 Updated last year
- Utilities for PyTorch distributed ☆23 Updated last year
- ☆21 Updated 5 months ago
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam ☆69 Updated 3 months ago
- Automatically take good care of your preemptible TPUs ☆32 Updated last year
- FID computation in Jax/Flax. ☆24 Updated 4 months ago
- A repo based on XiLin Li's PSGD repo that extends some of the experiments. ☆14 Updated last month
- An implementation of the Llama architecture, to instruct and delight ☆21 Updated 3 months ago
- Implementation of Diffusion Transformers and Rectified Flow in Jax ☆20 Updated 4 months ago
- A JAX implementation of the continuous time formulation of Consistency Models ☆83 Updated last year
- Experiment of using Tangent to autodiff triton ☆72 Updated 9 months ago
- ☆50 Updated 10 months ago
- PyTorch interface for TrueGrad Optimizers ☆39 Updated last year
- Latent Diffusion Language Models ☆67 Updated last year
- ☆16 Updated 2 months ago
- ☆19 Updated last week
- ☆46 Updated last month
- ☆48 Updated this week
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496) ☆13 Updated 2 weeks ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆43 Updated last year
- ☆73 Updated 4 months ago
- Implementation of PSGD optimizer in JAX ☆17 Updated last week
- ☆53 Updated 10 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆70 Updated 6 months ago