lessw2020 / Ranger22
Testing various improvements to Ranger21 for 2022
☆19 Updated last year
Alternatives and similar repositories for Ranger22
Users that are interested in Ranger22 are comparing it to the libraries listed below
- Axial Positional Embedding for Pytorch ☆84 Updated 11 months ago
- Implementation of Nyström Self-attention, from the paper Nyströmformer ☆145 Updated 10 months ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆220 Updated 2 years ago
- Another attempt at a long-context / efficient transformer by me ☆38 Updated 3 years ago
- Implementation of LogAvgExp for Pytorch ☆37 Updated 9 months ago
- Layerwise Batch Entropy Regularization ☆24 Updated 3 years ago
- Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI ☆98 Updated 4 years ago
- Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi… ☆51 Updated 3 years ago
- ☆41 Updated 4 years ago
- A collection of optimizers, some arcane others well known, for Flax. ☆29 Updated 4 years ago
- ☆47 Updated 3 years ago
- ☆75 Updated 3 years ago
- Framework for creating (partially) reversible neural networks with PyTorch ☆156 Updated 3 years ago
- Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload ☆132 Updated 3 years ago
- A convolution-free, transformer-only version of the CycleGAN framework ☆33 Updated 3 years ago
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… ☆116 Updated 3 years ago
- An open source implementation of CLIP. ☆33 Updated 3 years ago
- PyTorch reimplementation of the paper "HyperMixer: An MLP-based Green AI Alternative to Transformers" [arXiv 2022]. ☆18 Updated 3 years ago
- ImageNet-12k subset of ImageNet-21k (fall11) ☆21 Updated 2 years ago
- Implementation of Metaformer, but in an autoregressive manner ☆26 Updated 3 years ago
- ☆33 Updated 2 years ago
- A dashboard for exploring timm learning rate schedulers ☆19 Updated last year
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆207 Updated 2 years ago
- PyTorch interface for TrueGrad Optimizers ☆43 Updated 2 years ago
- Utilities for PyTorch distributed ☆25 Updated 11 months ago
- FID computation in Jax/Flax. ☆29 Updated last year
- My explorations into editing the knowledge and memories of an attention network ☆35 Updated 3 years ago
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆57 Updated 3 years ago
- Code repository of the paper "Modelling Long Range Dependencies in ND: From Task-Specific to a General Purpose CNN" https://arxiv.org/abs… ☆183 Updated 8 months ago
- Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper ☆81 Updated 4 years ago