openai / sparse_attention

Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"

☆1,590

Alternatives and similar repositories for sparse_attention

Users who are interested in sparse_attention are comparing it to the libraries listed below.

kimiyoung / transformer-xl
☆3,673 Updated 3 years ago
lucidrains / reformer-pytorch
Reformer, the efficient Transformer, in Pytorch
☆2,180 Updated 2 years ago
idiap / fast-transformers
Pytorch library for fast transformer implementations
☆1,744 Updated 2 years ago
facebookresearch / adaptive-span
Transformer training code for sequential tasks
☆610 Updated 4 years ago
Smerity / sha-rnn
Single Headed Attention RNN - "Stop thinking with your head"
☆1,182 Updated 3 years ago
mit-han-lab / lite-transformer
[ICLR 2020] Lite Transformer with Long-Short Range Attention
☆610 Updated last year
lucidrains / performer-pytorch
An implementation of Performer, a linear attention-based transformer, in Pytorch
☆1,153 Updated 3 years ago
huggingface / pytorch-openai-transformer-lm
🐥 A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI
☆1,517 Updated 4 years ago
epfml / attention-cnn
Source code for "On the Relationship between Self-Attention and Convolutional Layers"
☆1,110 Updated 2 years ago
asappresearch / sru
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
☆2,106 Updated 3 years ago
tatp22 / linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
☆420 Updated 3 years ago
IBM / pytorch-seq2seq
An open source framework for seq2seq models in PyTorch.
☆1,514 Updated last month
google-research / long-range-arena
Long Range Arena for Benchmarking Efficient Transformers
☆765 Updated last year
allenai / longformer
Longformer: The Long-Document Transformer
☆2,171 Updated 2 years ago
graykode / gpt-2-Pytorch
Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation
☆1,009 Updated 6 years ago
salesforce / awd-lstm-lm
LSTM and QRNN Language Model Toolkit for PyTorch
☆1,981 Updated 3 years ago
cybertronai / gradient-checkpointing
Make huge neural nets fit in memory
☆2,814 Updated 5 years ago
openai / blocksparse
Efficient GPU kernels for block-sparse matrix multiplication and convolution
☆1,062 Updated 2 years ago
facebookresearch / XLM
PyTorch original implementation of Cross-lingual Language Model Pretraining.
☆2,922 Updated 2 years ago
lucidrains / linear-attention-transformer
Transformer based on a variant of attention that has linear complexity with respect to sequence length
☆801 Updated last year
LiyuanLucasLiu / RAdam
On the Variance of the Adaptive Learning Rate and Beyond
☆2,550 Updated 4 years ago
openai / finetune-transformer-lm
Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
☆2,245 Updated 6 years ago
majumderb / rezero
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆415 Updated last year
google-research / uda
Unsupervised Data Augmentation (UDA)
☆2,202 Updated 4 years ago
tensorflow / mesh
Mesh TensorFlow: Model Parallelism Made Easier
☆1,620 Updated last year
salesforce / pytorch-qrnn
PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM
☆1,262 Updated 3 years ago
lilianweng / transformer-tensorflow
Implementation of Transformer Model in Tensorflow
☆476 Updated 2 years ago
jayparks / transformer
A Pytorch Implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation"
☆564 Updated 5 years ago
asyml / texar-pytorch
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CAS…
☆747 Updated 3 years ago
SamLynnEvans / Transformer
Transformer seq2seq model, program that can build a language translator from parallel corpus
☆1,412 Updated 2 years ago