lucidrains / lion-pytorch
🦁 Lion, a new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(W), in PyTorch
☆2,161 · Updated 9 months ago
Alternatives and similar repositories for lion-pytorch
Users who are interested in lion-pytorch are comparing it to the libraries listed below.
- Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models ☆795 · Updated 3 months ago
- maximal update parametrization (µP) ☆1,594 · Updated last year
- Foundation Architecture for (M)LLMs ☆3,112 · Updated last year
- Machine learning metrics for distributed, scalable PyTorch applications. ☆2,333 · Updated this week
- SAM: Sharpness-Aware Minimization (PyTorch) ☆1,912 · Updated last year
- A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch. ☆2,799 · Updated 2 months ago
- The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training” ☆969 · Updated last year
- TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale. ☆1,648 · Updated last week
- Schedule-Free Optimization in PyTorch ☆2,206 · Updated 3 months ago
- Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more. ☆3,110 · Updated 3 months ago
- Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch ☆746 · Updated last month
- D-Adaptation for SGD, Adam and AdaGrad ☆523 · Updated 7 months ago
- A simple way to keep track of an Exponential Moving Average (EMA) version of your Pytorch model ☆601 · Updated 9 months ago
- Structured state space sequence models ☆2,718 · Updated last year
- Tensors, for human consumption ☆1,287 · Updated 2 months ago
- FFCV: Fast Forward Computer Vision (and other ML workloads!) ☆2,961 · Updated last year
- Transformer based on a variant of attention that is linear complexity in respect to sequence length ☆797 · Updated last year
- torchview: visualize pytorch models ☆986 · Updated 3 months ago
- View model summaries in PyTorch! ☆2,849 · Updated this week
- Code release for ConvNeXt V2 model ☆1,824 · Updated last year
- A method to increase the speed and lower the memory footprint of existing vision transformers. ☆1,089 · Updated last year
- Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch ☆1,175 · Updated last year
- ☆785 · Updated 2 weeks ago
- Pytorch library for fast transformer implementations ☆1,727 · Updated 2 years ago
- Cramming the training of a (BERT-type) language model into limited compute. ☆1,348 · Updated last year
- Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch ☆1,171 · Updated 2 years ago
- Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch ☆1,261 · Updated 2 years ago
- EVA Series: Visual Representation Fantasies from BAAI ☆2,560 · Updated last year
- torch-optimizer -- collection of optimizers for Pytorch ☆3,136 · Updated last year
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ☆1,583 · Updated last year