sail-sg / Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

☆799

Alternatives and similar repositories for Adan

Users that are interested in Adan are comparing it to the libraries listed below

lucidrains / ema-pytorch
A simple way to keep track of an Exponential Moving Average (EMA) version of your Pytorch model
☆616 Updated 10 months ago
locuslab / convmixer
Implementation of ConvMixer for "Patches Are All You Need? 🤷"
☆1,077 Updated 2 years ago
SHI-Labs / Neighborhood-Attention-Transformer
Neighborhood Attention Transformer, arxiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arxiv 2022
☆1,146 Updated last year
sail-sg / poolformer
PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
☆1,355 Updated last year
lucidrains / linear-attention-transformer
Transformer based on a variant of attention that has linear complexity with respect to sequence length
☆801 Updated last year
xxxnell / how-do-vits-work
(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
☆819 Updated 3 years ago
facebookresearch / ToMe
A method to increase the speed and lower the memory footprint of existing vision transformers.
☆1,110 Updated last year
lucidrains / mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
☆1,048 Updated 3 months ago
google-research / vmoe
☆682 Updated 2 months ago
fadel / pytorch_ema
Tiny PyTorch library for maintaining a moving average of a collection of parameters.
☆438 Updated last year
google-research / pix2seq
Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
☆930 Updated last year
lucidrains / lion-pytorch
🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch
☆2,166 Updated 10 months ago
lucidrains / memory-efficient-attention-pytorch
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"
☆383 Updated 2 years ago
lucidrains / rotary-embedding-torch
Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch
☆766 Updated 2 months ago
snap-research / EfficientFormer
EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]
☆1,081 Updated 2 years ago
Tony-Y / pytorch_warmup
Learning Rate Warmup in PyTorch
☆413 Updated 4 months ago
lucidrains / x-clip
A concise but complete implementation of CLIP with various experimental improvements from recent papers
☆716 Updated 2 years ago
lucidrains / linformer
Implementation of Linformer for Pytorch
☆299 Updated last year
facebookresearch / ConvNeXt-V2
Code release for ConvNeXt V2 model
☆1,853 Updated last year
microsoft / FocalNet
[NeurIPS 2022] Official code for "Focal Modulation Networks"
☆741 Updated last year
davda54 / sam
SAM: Sharpness-Aware Minimization (PyTorch)
☆1,925 Updated last year
google-research / sam
☆605 Updated 2 months ago
lessw2020 / Ranger21
Ranger deep learning optimizer rewrite to use newest components
☆338 Updated last year
SHI-Labs / Compact-Transformers
Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!)
☆536 Updated 11 months ago
facebookresearch / moco-v3
PyTorch implementation of MoCo v3 https://arxiv.org/abs/2104.02057
☆1,298 Updated 3 years ago
sail-sg / metaformer
MetaFormer Baselines for Vision (TPAMI 2024)
☆492 Updated last year
bytedance / ibot
iBOT: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)
☆748 Updated 3 years ago
microsoft / Cream
This is a collection of our NAS and Vision Transformer work.
☆1,806 Updated last year
google-research / maxvit
[ECCV 2022] Official repository for "MaxViT: Multi-Axis Vision Transformer". SOTA foundation models for classification, detection, segmen…
☆487 Updated 2 years ago
Newbeeer / Poisson_flow
Code for NeurIPS 2022 Paper, "Poisson Flow Generative Models" (PFGM)
☆866 Updated 2 years ago