sooftware / attentions
PyTorch implementation of some attentions for Deep Learning Researchers.
☆520 Updated 2 years ago
Alternatives and similar repositories for attentions:
Users who are interested in attentions are comparing it to the libraries listed below:
- An implementation of Performer, a linear attention-based transformer, in Pytorch ☆1,110 Updated 2 years ago
- Transformer based on a variant of attention that is linear complexity in respect to sequence length ☆724 Updated 8 months ago
- Pytorch library for fast transformer implementations ☆1,665 Updated last year
- Flexible components pairing 🤗 Transformers with Pytorch Lightning ☆613 Updated 2 years ago
- Reformer, the efficient Transformer, in Pytorch ☆2,140 Updated last year
- Implementation of Linformer for Pytorch ☆262 Updated last year
- My take on a practical implementation of Linformer for Pytorch. ☆409 Updated 2 years ago
- An implementation of local windowed attention for language modeling ☆403 Updated this week
- Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch ☆426 Updated 3 years ago
- Learning Rate Warmup in PyTorch ☆399 Updated this week
- ☆446 Updated last year
- Pytorch Lightning code guideline for conferences ☆1,245 Updated last year
- Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch ☆609 Updated last month
- A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in Pytorch ☆221 Updated last year
- Early stopping for PyTorch ☆1,238 Updated 2 months ago
- An All-MLP solution for Vision, from Google AI ☆1,007 Updated 4 months ago
- An implementation of masked language modeling for Pytorch, made as concise and simple as possible ☆179 Updated last year
- Long Range Arena for Benchmarking Efficient Transformers ☆739 Updated last year
- Transformer implementation in PyTorch. ☆474 Updated 5 years ago
- Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms ☆256 Updated 3 years ago
- Contrastive Predictive Coding for Automatic Speaker Verification ☆484 Updated 5 years ago
- ☆64 Updated 4 years ago
- The entmax mapping and its loss, a family of sparse softmax alternatives. ☆419 Updated 6 months ago
- Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch ☆1,115 Updated last year
- The Noise Contrastive Estimation for softmax output written in Pytorch ☆318 Updated 5 years ago
- Fully featured implementation of Routing Transformer ☆288 Updated 3 years ago
- Implementation of Transformer encoder in PyTorch ☆65 Updated 4 years ago
- Longformer: The Long-Document Transformer ☆2,072 Updated last year
- Standalone TFRecord reader/writer with PyTorch data loaders ☆872 Updated 4 months ago
- kmeans using PyTorch ☆499 Updated last year