alex-matton / causal-transformer-decoderLinks

☆73

Alternatives and similar repositories for causal-transformer-decoder

Users that are interested in causal-transformer-decoder are comparing it to the libraries listed below

Sorting:

aliutkus / spe
Relative Positional Encoding for Transformers with Linear Complexity
☆65Updated 3 years ago
lucidrains / memformer
Implementation of Memformer, a Memory-augmented Transformer, in Pytorch
☆126Updated 5 years ago
lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆102Updated 2 years ago
lucidrains / Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
☆207Updated 2 years ago
lucidrains / long-short-transformer
Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch
☆120Updated 4 years ago
lucidrains / memory-transformer-xl
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
☆49Updated 5 years ago
lucidrains / compressive-transformer-pytorch
Pytorch implementation of Compressive Transformers, from Deepmind
☆163Updated 4 years ago
ctlllll / SGConv
☆163Updated 3 years ago
lucidrains / axial-positional-embedding
Axial Positional Embedding for Pytorch
☆84Updated 11 months ago
NVIDIA / transformer-ls
Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).
☆228Updated 3 years ago
yaohungt / TransformerDissection
[EMNLP'19] Summary for Transformer Understanding
☆53Updated 6 years ago
seongminp / transformers-into-vaes
Code for "Finetuning Pretrained Transformers into Variational Autoencoders"
☆40Updated 3 years ago
lucidrains / n-grammer-pytorch
Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch
☆76Updated 3 years ago
microsoft / ResiDual
ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802
☆97Updated 2 years ago
rosinality / imputer-pytorch
Implementation of Imputer: Sequence Modelling via Imputation and Dynamic Programming in PyTorch
☆58Updated 5 years ago
lucidrains / fast-transformer-pytorch
Implementation of Fast Transformer in Pytorch
☆176Updated 4 years ago
ag1988 / dss
Sequence Modeling with Structured State Spaces
☆67Updated 3 years ago
cfoster0 / CLAP
Contrastive Language-Audio Pretraining
☆88Updated 3 years ago
lucidrains / product-key-memory
Standalone Product Key Memory module in Pytorch - for augmenting Transformer models
☆87Updated 3 months ago
distsup / DistSup
Representation learning for NLP @ JSALT19
☆40Updated 5 years ago
facebookresearch / mega
Sequence modeling with Mega.
☆303Updated 3 years ago
lucidrains / ponder-transformer
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper
☆82Updated 4 years ago
cpcp1998 / PermuteFormer
Code for the paper PermuteFormer
☆42Updated 4 years ago
arxyzan / data2vec-pytorch
PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI
☆184Updated 2 years ago
lucidrains / insertion-deletion-ddpm
Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models
☆30Updated 3 years ago
GoGoDuck912 / pytorch-vector-quantization
A Pytorch Implementations for Various Vector Quantization Methods
☆33Updated 4 years ago
lucidrains / g-mlp-gpt
GPT, but made only out of MLPs
☆89Updated 4 years ago
lucidrains / charformer-pytorch
Implementation of the GBST block from the Charformer paper, in Pytorch
☆118Updated 4 years ago
lsj2408 / URPE
[NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)
☆34Updated 2 years ago
janfreyberg / pytorch-revgrad
A minimal pytorch package implementing a gradient reversal layer.
☆158Updated last year