microsoft / dion
Dion optimizer algorithm
☆259 · Updated this week
Alternatives and similar repositories for dion
Users who are interested in dion are comparing it to the libraries listed below.
- ☆275 · Updated last year
- 🧱 Modula software package ☆216 · Updated 2 weeks ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆283 · Updated last week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆149 · Updated last month
- Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs ☆466 · Updated this week
- ☆353 · Updated this week
- supporting pytorch FSDP for optimizers ☆84 · Updated 8 months ago
- PyTorch Single Controller ☆345 · Updated this week
- NanoGPT-speedrunning for the poor T4 enjoyers ☆69 · Updated 3 months ago
- Getting crystal-like representations with harmonic loss ☆193 · Updated 4 months ago
- rl from zero pretrain, can it be done? yes. ☆193 · Updated this week
- Decentralized RL Training at Scale ☆403 · Updated this week
- A MAD laboratory to improve AI architecture designs 🧪 ☆123 · Updated 7 months ago
- seqax = sequence modeling + JAX ☆165 · Updated 2 weeks ago
- ☆174 · Updated 4 months ago
- ☆83 · Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆104 · Updated 5 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆275 · Updated 3 weeks ago
- Efficient optimizers ☆253 · Updated last week
- ☆144 · Updated this week
- ☆182 · Updated this week
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆94 · Updated 2 weeks ago
- ☆208 · Updated 5 months ago
- Normalized Transformer (nGPT) ☆186 · Updated 8 months ago
- PTX-Tutorial Written Purely By AIs (Deep Research of Openai and Claude 3.7) ☆66 · Updated 4 months ago
- SIMD quantization kernels ☆78 · Updated last week
- Load compute kernels from the Hub ☆233 · Updated this week
- Understand and test language model architectures on synthetic tasks. ☆221 · Updated 3 weeks ago
- Scalable and Performant Data Loading ☆291 · Updated this week
- Simple Transformer in Jax ☆138 · Updated last year