lucidrains / h-transformer-1d

Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning

☆165

Alternatives and similar repositories for h-transformer-1d

Users who are interested in h-transformer-1d are comparing it to the libraries listed below.

lucidrains / nystrom-attention
Implementation of Nyström Self-attention, from the paper Nyströmformer
☆141Updated 7 months ago
lucidrains / axial-positional-embedding
Axial Positional Embedding for Pytorch
☆83Updated 8 months ago
lucidrains / Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
☆206Updated 2 years ago
lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆101Updated 2 years ago
lucidrains / fast-transformer-pytorch
Implementation of Fast Transformer in Pytorch
☆177Updated 4 years ago
lucidrains / long-short-transformer
Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch
☆120Updated 4 years ago
lucidrains / feedback-transformer-pytorch
Implementation of Feedback Transformer in Pytorch
☆108Updated 4 years ago
ctlllll / SGConv
☆164Updated 2 years ago
wilile26811249 / Fastformer-PyTorch
Unofficial PyTorch implementation of Fastformer, based on the paper "Fastformer: Additive Attention Can Be All You Need".
☆133Updated 4 years ago
NVIDIA / transformer-ls
Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).
☆228Updated 3 years ago
cpcp1998 / PermuteFormer
Code for the paper PermuteFormer
☆42Updated 4 years ago
aliutkus / spe
Relative Positional Encoding for Transformers with Linear Complexity
☆65Updated 3 years ago
lucidrains / electra-pytorch
A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in Pytorch
☆235Updated 2 years ago
lucidrains / product-key-memory
Standalone Product Key Memory module in Pytorch - for augmenting Transformer models
☆83Updated last year
lucidrains / routing-transformer
Fully featured implementation of Routing Transformer
☆296Updated 3 years ago
lucidrains / memory-transformer-xl
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
☆49Updated 5 years ago
ischlag / fast-weight-transformers
Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers.
☆105Updated 4 years ago
lucidrains / sinkhorn-transformer
Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention
☆268Updated 4 years ago
rishikksh20 / FNet-pytorch
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
☆259Updated 4 years ago
lucidrains / g-mlp-gpt
GPT, but made only out of MLPs
☆89Updated 4 years ago
Fraser-Greenlee / T5-VAE
Check out the new version at the link!
☆22Updated 4 years ago
lucidrains / memformer
Implementation of Memformer, a Memory-augmented Transformer, in Pytorch
☆123Updated 4 years ago
alex-matton / causal-transformer-decoder
☆72Updated 4 years ago
lucidrains / contrastive-learner
A simple to use pytorch wrapper for contrastive self-supervised learning on any neural network
☆150Updated 4 years ago
pkuzengqi / Skyformer
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021)
☆63Updated 3 years ago
lucidrains / linformer
Implementation of Linformer for Pytorch
☆300Updated last year
lucidrains / compressive-transformer-pytorch
Pytorch implementation of Compressive Transformers, from Deepmind
☆162Updated 4 years ago
10-zin / Synthesizer
A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models"
☆73Updated 2 years ago
rish-16 / aft-pytorch
Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc.
☆243Updated 3 years ago
lucidrains / memory-compressed-attention
Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia By Summarizing Long Sequences"
☆69Updated 2 years ago