lucidrains / lion-pytorchLinks

🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch

☆2,150

Alternatives and similar repositories for lion-pytorch

Users that are interested in lion-pytorch are comparing it to the libraries listed below

Sorting:

microsoft / mup
maximal update parametrization (µP)
☆1,569Updated last year
sail-sg / Adan
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
☆797Updated last month
facebookresearch / dadaptation
D-Adaptation for SGD, Adam and AdaGrad
☆524Updated 6 months ago
Lightning-AI / torchmetrics
Machine learning metrics for distributed, scalable PyTorch applications.
☆2,316Updated last week
lucidrains / rotary-embedding-torch
Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch
☆719Updated last week
webdataset / webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
☆2,736Updated last month
lucidrains / ema-pytorch
A simple way to keep track of an Exponential Moving Average (EMA) version of your Pytorch model
☆600Updated 7 months ago
microsoft / torchscale
Foundation Architecture for (M)LLMs
☆3,097Updated last year
google-research / big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
☆3,044Updated 2 months ago
mert-kurttutan / torchview
torchview: visualize pytorch models
☆969Updated 2 months ago
lucidrains / x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
☆5,474Updated this week
facebookresearch / schedule_free
Schedule-Free Optimization in PyTorch
☆2,193Updated 2 months ago
idiap / fast-transformers
Pytorch library for fast transformer implementations
☆1,725Updated 2 years ago
Liuhong99 / Sophia
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
☆965Updated last year
JonasGeiping / cramming
Cramming the training of a (BERT-type) language model into limited compute.
☆1,339Updated last year
TylerYep / torchinfo
View model summaries in PyTorch!
☆2,827Updated last week
lucidrains / vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
☆3,451Updated last week
xl0 / lovely-tensors
Tensors, for human consumption
☆1,271Updated last month
davda54 / sam
SAM: Sharpness-Aware Minimization (PyTorch)
☆1,906Updated last year
libffcv / ffcv
FFCV: Fast Forward Computer Vision (and other ML workloads!)
☆2,952Updated last year
facebookresearch / multimodal
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
☆1,635Updated this week
facebookresearch / ToMe
A method to increase the speed and lower the memory footprint of existing vision transformers.
☆1,079Updated last year
lucidrains / linear-attention-transformer
Transformer based on a variant of attention that is linear complexity in respect to sequence length
☆789Updated last year
state-spaces / s4
Structured state space sequence models
☆2,695Updated last year
facebookresearch / ConvNeXt-V2
Code release for ConvNeXt V2 model
☆1,801Updated 11 months ago
lucidrains / flamingo-pytorch
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch
☆1,256Updated 2 years ago
lucidrains / perceiver-pytorch
Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch
☆1,162Updated last year
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,350Updated 3 months ago
changjonathanc / minLoRA
minLoRA: a minimal PyTorch library that allows you to apply LoRA to any PyTorch model.
☆469Updated 2 years ago
lucidrains / CoCa-pytorch
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch
☆1,162Updated last year