Separius / awesome-fast-attention
list of efficient attention modules
☆1,019Updated 4 years ago
Alternatives and similar repositories for awesome-fast-attention
Users that are interested in awesome-fast-attention are comparing it to the libraries listed below
- Source code for "On the Relationship between Self-Attention and Convolutional Layers"☆1,116Updated 2 years ago
- Some tricks of pytorch...☆1,195Updated last year
- label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful☆2,257Updated last year
- PyTorch implementation of Contrastive Learning methods☆1,996Updated 2 years ago
- Gradually-Warmup Learning Rate Scheduler for PyTorch☆993Updated last year
- A PyTorch Implementation of Focal Loss.☆991Updated 6 years ago
- A comprehensive list of awesome contrastive self-supervised learning papers.☆1,303Updated last year
- A curated list of resources for Learning with Noisy Labels☆2,720Updated 7 months ago
- Some tricks for PyTorch☆576Updated 5 years ago
- A multi-task learning example for the paper https://arxiv.org/abs/1705.07115 ☆867Updated 5 years ago
- Debug PyTorch code using PySnooper☆802Updated 4 years ago
- My best practice of training large dataset using PyTorch.☆1,107Updated last year
- A quickstart and benchmark for pytorch distributed training.☆1,665Updated last year
- Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase☆1,207Updated last year
- [ICLR 2020] Lite Transformer with Long-Short Range Attention☆610Updated last year
- Unofficial implementation of: Multi-task learning using uncertainty to weigh losses for scene geometry and semantics☆563Updated 4 years ago
- ☆882Updated last year
- Pytorch implementation of the paper "Class-Balanced Loss Based on Effective Number of Samples"☆804Updated last year
- pytorch memory track code☆1,016Updated 4 years ago
- A curated list of Multimodal Related Research.☆1,378Updated 2 years ago
- [arXiv 2019] "Contrastive Multiview Coding", also contains implementations for MoCo and InstDis☆1,334Updated 5 years ago
- DeLighT: Very Deep and Light-Weight Transformers☆468Updated 5 years ago
- Learning Rate Warmup in PyTorch☆414Updated 5 months ago
- My take on a practical implementation of Linformer for Pytorch.☆421Updated 3 years ago
- Sublinear memory optimization for deep learning. https://arxiv.org/abs/1604.06174 ☆605Updated 5 years ago
- Implementation of LambdaNetworks, a new approach to image recognition that reaches SOTA with less compute☆1,530Updated 5 years ago
- Reformer, the efficient Transformer, in Pytorch☆2,189Updated 2 years ago
- An All-MLP solution for Vision, from Google AI☆1,053Updated 4 months ago
- This repository contains code for the paper "Decoupling Representation and Classifier for Long-Tailed Recognition", published at ICLR 202…☆976Updated 4 years ago
- The implementation of "End-to-End Multi-Task Learning with Attention" [CVPR 2019].☆724Updated 3 years ago