Separius / awesome-fast-attention
list of efficient attention modules
☆1,013 Updated 3 years ago
Alternatives and similar repositories for awesome-fast-attention
Users who are interested in awesome-fast-attention are comparing it to the libraries listed below.
- Some tricks of pytorch... ☆1,194 Updated last year
- Source code for "On the Relationship between Self-Attention and Convolutional Layers" ☆1,106 Updated 2 years ago
- A quickstart and benchmark for pytorch distributed training. ☆1,668 Updated last year
- Debug PyTorch code using PySnooper ☆797 Updated 4 years ago
- Gradually-Warmup Learning Rate Scheduler for PyTorch ☆991 Updated 9 months ago
- A PyTorch Implementation of Focal Loss. ☆985 Updated 5 years ago
- Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase ☆1,202 Updated last year
- knowledge distillation papers ☆758 Updated 2 years ago
- pytorch memory track code ☆1,017 Updated 4 years ago
- ☆881 Updated last year
- Unofficial implementation of: Multi-task learning using uncertainty to weigh losses for scene geometry and semantics ☆555 Updated 3 years ago
- PyTorch implementation of Contrastive Learning methods ☆1,988 Updated last year
- label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful ☆2,245 Updated 9 months ago
- some tricks for PyTorch ☆576 Updated 5 years ago
- A comprehensive list of awesome contrastive self-supervised learning papers. ☆1,284 Updated 10 months ago
- My best practice of training large dataset using PyTorch. ☆1,100 Updated last year
- Awesome Knowledge-Distillation. Knowledge distillation papers organized by category (2014-2021). ☆2,614 Updated 2 years ago
- DeLighT: Very Deep and Light-Weight Transformers ☆470 Updated 4 years ago
- Implementation of LambdaNetworks, a new approach to image recognition that reaches SOTA with less compute ☆1,533 Updated 4 years ago
- An implementation of Performer, a linear attention-based transformer, in Pytorch ☆1,142 Updated 3 years ago
- My take on a practical implementation of Linformer for Pytorch. ☆417 Updated 3 years ago
- [ICLR 2020] Lite Transformer with Long-Short Range Attention ☆612 Updated last year
- An All-MLP solution for Vision, from Google AI ☆1,034 Updated 3 weeks ago
- Reformer, the efficient Transformer, in Pytorch ☆2,177 Updated 2 years ago
- A curated list of resources for Learning with Noisy Labels ☆2,694 Updated 3 months ago
- torchsummaryX: Improved visualization tool of torchsummary ☆303 Updated 3 years ago
- Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, Auxiliary Tasks in Multi-task Learning ☆633 Updated 5 years ago
- Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth" ☆410 Updated last year
- Pytorch library for fast transformer implementations ☆1,725 Updated 2 years ago
- Sublinear memory optimization for deep learning. https://arxiv.org/abs/1604.06174 ☆599 Updated 5 years ago