facebookresearch / optimizers
For optimization algorithm research and development.
☆498 Updated this week
Alternatives and similar repositories for optimizers:
Users who are interested in optimizers are comparing it to the libraries listed below.
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆370 Updated this week
- TensorDict is a PyTorch-dedicated tensor container. ☆898 Updated this week
- Implementation of Diffusion Transformer (DiT) in JAX ☆269 Updated 9 months ago
- Annotated version of the Mamba paper ☆475 Updated last year
- Efficient optimizers ☆183 Updated 2 weeks ago
- ☆301 Updated 9 months ago
- Helpful tools and examples for working with flex-attention ☆689 Updated last week
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆524 Updated last month
- ☆420 Updated 5 months ago
- Universal Tensor Operations in Einstein-Inspired Notation for Python. ☆361 Updated last month
- Scalable and Performant Data Loading ☆230 Updated this week
- ☆214 Updated 8 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆219 Updated 2 weeks ago
- ☆169 Updated 3 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX ☆557 Updated this week
- Puzzles for exploring transformers ☆333 Updated last year
- A JAX-based library for designing and training transformer models from scratch. ☆282 Updated 6 months ago
- Implementation of https://srush.github.io/annotated-s4 ☆485 Updated 2 years ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆179 Updated 6 months ago
- Muon optimizer: +>30% sample efficiency with <3% wallclock overhead ☆505 Updated last week
- Best practices & guides on how to write distributed PyTorch training code ☆368 Updated 3 weeks ago
- Named tensors with first-class dimensions for PyTorch ☆321 Updated last year
- What would you do with 1000 H100s... ☆1,016 Updated last year
- jax-triton contains integrations between JAX and OpenAI Triton ☆384 Updated last week
- 🧱 Modula software package ☆173 Updated 2 weeks ago
- Library for reading and processing ML training data. ☆407 Updated this week
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI ☆276 Updated this week
- Accelerated First Order Parallel Associative Scan ☆175 Updated 7 months ago
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac… ☆229 Updated 2 months ago
- Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate" ☆419 Updated 3 months ago