facebookresearch / optimizersLinks

For optimization algorithm research and development.

☆525

Alternatives and similar repositories for optimizers

Users that are interested in optimizers are comparing it to the libraries listed below

Sorting:

srush / annotated-mamba
Annotated version of the Mamba paper
☆487Updated last year
mlcommons / algorithmic-efficiency
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement…
☆389Updated this week
pytorch / tensordict
TensorDict is a pytorch dedicated tensor container.
☆949Updated this week
facebookresearch / spdl
Scalable and Performant Data Loading
☆291Updated this week
kvfrans / jax-diffusion-transformer
Implementation of Diffusion Transformer (DiT) in JAX
☆286Updated last year
fferflo / einx
Universal Tensor Operations in Einstein-Inspired Notation for Python.
☆392Updated 3 months ago
HomebrewML / HeavyBall
Efficient optimizers
☆252Updated last week
apple / ml-sigma-reparam
☆304Updated last year
google-deepmind / nanodo
☆275Updated last year
KellerJordan / cifar10-airbench
CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds
☆274Updated 2 weeks ago
stanford-crfm / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
☆630Updated this week
HenryNdubuaku / nanodl
A Jax-based library for building transformers, includes implementations of GPT, Gemma, LlaMa, Mixtral, Whisper, SWin, ViT and more.
☆290Updated 11 months ago
BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆565Updated this week
facebookresearch / torchdim
Named tensors with first-class dimensions for PyTorch
☆332Updated 2 years ago
google / grain
Library for reading and processing ML training data.
☆487Updated this week
LambdaLabsML / distributed-training-guide
Best practices & guides on how to write distributed pytorch training code
☆460Updated 5 months ago
nanowell / AdEMAMix-Optimizer-Pytorch
The AdEMAMix Optimizer: Better, Faster, Older.
☆184Updated 10 months ago
srush / Transformer-Puzzles
Puzzles for exploring transformers
☆356Updated 2 years ago
nikhilvyas / SOAP
☆206Updated 8 months ago
srush / annotated-s4
Implementation of https://srush.github.io/annotated-s4
☆500Updated last month
pytorch-labs / attention-gym
Helpful tools and examples for working with flex-attention
☆904Updated 2 weeks ago
pytorch-labs / monarch
PyTorch Single Controller
☆341Updated last week
modula-systems / modula
🧱 Modula software package
☆210Updated last week
srush / Autodiff-Puzzles
☆443Updated 9 months ago
jax-ml / jax-triton
jax-triton contains integrations between JAX and OpenAI Triton
☆411Updated last month
iShohei220 / adopt
Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate"
☆429Updated 7 months ago
jax-ml / scaling-book
Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs
☆445Updated this week
lucidrains / flash-attention-jax
Implementation of Flash Attention in Jax
☆215Updated last year
Lightning-AI / litData
Transform datasets at scale. Optimize datasets for fast AI model training.
☆516Updated this week
clu0 / unet.cu
UNet diffusion model in pure CUDA
☆613Updated last year