facebookresearch / optimizers
For optimization algorithm research and development.
☆491 · Updated this week
Alternatives and similar repositories for optimizers:
Users interested in optimizers are comparing it to the libraries listed below.
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆364 · Updated last week
- ☆299 · Updated 7 months ago
- TensorDict is a dedicated tensor container for PyTorch (see the TensorDict sketch after this list). ☆877 · Updated this week
- ☆208 · Updated 7 months ago
- Annotated version of the Mamba paper ☆473 · Updated 11 months ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆265 · Updated 8 months ago
- Library for reading and processing ML training data. ☆385 · Updated this week
- Scalable and Performant Data Loading ☆217 · Updated this week
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆514 · Updated this week
- Helpful tools and examples for working with flex-attention (a flex_attention sketch follows this list). ☆635 · Updated this week
- ☆158 · Updated 2 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX ☆542 · Updated this week
- ☆416 · Updated 4 months ago
- Universal Tensor Operations in Einstein-Inspired Notation for Python (see the einx sketch below). ☆353 · Updated last week
- Muon optimizer: roughly +30% sample efficiency with <3% wall-clock overhead (its Newton-Schulz step is sketched below). ☆253 · Updated last week
- Efficient optimizers ☆169 · Updated this week
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆724 · Updated this week
- A JAX-based library for designing and training transformer models from scratch. ☆281 · Updated 5 months ago
- Named tensors with first-class dimensions for PyTorch ☆321 · Updated last year
- The AdEMAMix Optimizer: Better, Faster, Older. (its two-EMA update is sketched below) ☆177 · Updated 5 months ago
- Orbax provides common checkpointing and persistence utilities for JAX users (see the save/restore sketch below). ☆338 · Updated this week
- Implementation of https://srush.github.io/annotated-s4 ☆482 · Updated 2 years ago
- Official implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate" ☆416 · Updated 2 months ago
- jax-triton contains integrations between JAX and OpenAI Triton ☆379 · Updated 3 weeks ago
- Best practices & guides on how to write distributed PyTorch training code ☆351 · Updated 3 weeks ago
- Unofficial JAX implementations of deep learning research papers ☆153 · Updated 2 years ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆205 · Updated this week
- ☆142 · Updated last year
- Transform datasets at scale. Optimize datasets for fast AI model training. ☆413 · Updated this week
- Implementation of Flash Attention in JAX ☆204 · Updated 11 months ago
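From the TensorDict entry above: a minimal sketch of batching several tensors in one container, assuming the documented `TensorDict(source, batch_size=...)` constructor; the tensor shapes here are invented for illustration.

```python
import torch
from tensordict import TensorDict

# Two tensors sharing a leading batch dimension of 4, stored in one container.
td = TensorDict(
    {"observation": torch.randn(4, 3, 84, 84), "reward": torch.zeros(4, 1)},
    batch_size=[4],
)

sub = td[:2]            # indexing is applied to every entry at once
td_cpu = td.to("cpu")   # device moves broadcast to all leaves
print(sub["observation"].shape)  # torch.Size([2, 3, 84, 84])
```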
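For the flex-attention entry: a sketch of a causal `score_mod` passed to PyTorch's `flex_attention` (available under `torch.nn.attention.flex_attention` in recent PyTorch releases; treat the exact import path and signature as assumptions about your installed version).

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

B, H, S, D = 2, 4, 128, 64
q, k, v = (torch.randn(B, H, S, D) for _ in range(3))

def causal(score, b, h, q_idx, kv_idx):
    # Send scores for future key positions to -inf before the softmax.
    return torch.where(q_idx >= kv_idx, score, torch.tensor(float("-inf")))

out = flex_attention(q, k, v, score_mod=causal)  # (B, H, S, D)
```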
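For the Einstein-inspired-notation entry (einx): a sketch assuming `einx.rearrange` and `einx.mean` with the project's documented bracket notation, where bracketed axes are the ones reduced over.

```python
import numpy as np
import einx

x = np.random.rand(2, 16, 16, 3)                    # batch, height, width, channels

tokens = einx.rearrange("b h w c -> b (h w) c", x)  # flatten the spatial axes
pooled = einx.mean("b [s] c", tokens)               # reduce the bracketed token axis
print(tokens.shape, pooled.shape)                   # (2, 256, 3) (2, 3)
```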
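For the Muon entry: the optimizer's distinctive step orthogonalizes the momentum matrix with a Newton-Schulz iteration before applying the update. The sketch below follows the quintic-iteration coefficients published in the reference repository, but it is illustrative rather than the canonical implementation (which, for instance, runs in bfloat16).

```python
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately replace a 2D matrix by its nearest semi-orthogonal matrix."""
    assert G.ndim == 2
    a, b, c = 3.4445, -4.7750, 2.0315   # quintic iteration coefficients
    X = G / (G.norm() + 1e-7)           # scale so the iteration converges
    transposed = X.size(0) > X.size(1)
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X
```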
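For the AdEMAMix entry: the paper's core idea is to mix a fast EMA of the gradient with a much slower one in the Adam-style numerator, so old gradients keep contributing. This is a rough single-tensor sketch reconstructed from the paper's description; the hyperparameter values and bias-correction details are assumptions, not the authors' code.

```python
import torch

def ademamix_step(p, g, state, lr=1e-3, b1=0.9, b2=0.999, b3=0.9999,
                  alpha=5.0, eps=1e-8):
    """One update for a single parameter tensor p given gradient g."""
    state["m1"].mul_(b1).add_(g, alpha=1 - b1)       # fast EMA (Adam-like)
    state["m2"].mul_(b3).add_(g, alpha=1 - b3)       # very slow EMA
    state["v"].mul_(b2).addcmul_(g, g, value=1 - b2) # second-moment EMA
    state["t"] += 1
    m1_hat = state["m1"] / (1 - b1 ** state["t"])    # bias-correct the fast EMA
    v_hat = state["v"] / (1 - b2 ** state["t"])
    # alpha weights the slow EMA of older gradients in the numerator.
    p.sub_(lr * (m1_hat + alpha * state["m2"]) / (v_hat.sqrt() + eps))

p = torch.randn(10)
state = {"m1": torch.zeros_like(p), "m2": torch.zeros_like(p),
         "v": torch.zeros_like(p), "t": 0}
ademamix_step(p, torch.randn(10), state)
```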
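For the Orbax entry: a minimal save/restore round trip for a JAX pytree. `PyTreeCheckpointer` is the long-standing documented entry point, though newer Orbax releases steer users toward other checkpointing APIs; treat this as a sketch against an assumed version, and note that `save` expects the target directory not to already contain a checkpoint.

```python
import jax.numpy as jnp
import orbax.checkpoint as ocp

state = {"params": {"w": jnp.ones((4, 4))}, "step": 100}

checkpointer = ocp.PyTreeCheckpointer()
checkpointer.save("/tmp/ckpt_demo", state)        # writes the pytree to disk
restored = checkpointer.restore("/tmp/ckpt_demo")
print(restored["params"]["w"].shape)              # (4, 4)
```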