lucidrains / compositional-attention-pytorch

Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process with disentangled search and retrieval head aggregation, in Pytorch

☆51

Alternatives and similar repositories for compositional-attention-pytorch

Users who are interested in compositional-attention-pytorch compare it to the libraries listed below

lucidrains / hourglass-transformer-pytorch
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
☆95 Updated 3 years ago
lucidrains / remixer-pytorch
Implementation of the Remixer Block from the Remixer paper, in Pytorch
☆36 Updated 4 years ago
lucidrains / isab-pytorch
An implementation of (Induced) Set Attention Block, from the Set Transformers paper
☆64 Updated 2 years ago
lucidrains / discrete-key-value-bottleneck-pytorch
Implementation of Discrete Key / Value Bottleneck, in Pytorch
☆88 Updated 2 years ago
lucidrains / mlp-gpt-jax
A GPT, made only of MLPs, in Jax
☆58 Updated 4 years ago
lucidrains / logavgexp-torch
Implementation of LogAvgExp for Pytorch
☆37 Updated 6 months ago
lucidrains / metaformer-gpt
Implementation of Metaformer, but in an autoregressive manner
☆26 Updated 3 years ago
Newbeeer / Anytime-Auto-Regressive-Model
Code for ICLR 2021 Paper, "Anytime Sampling for Autoregressive Models via Ordered Autoencoding"
☆26 Updated 2 years ago
lucidrains / local-attention-flax
Local Attention - Flax module for Jax
☆22 Updated 4 years ago
lucidrains / kronecker-attention-pytorch
Implementation of Kronecker Attention in Pytorch
☆19 Updated 5 years ago
Zasder3 / open_clip_juwels
An open source implementation of CLIP.
☆33 Updated 2 years ago
lucidrains / einops-exts
Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️
☆55 Updated 2 years ago
lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆101 Updated 2 years ago
lucidrains / multistream-transformers
Implementation of Multistream Transformers in Pytorch
☆54 Updated 4 years ago
lucidrains / panoptic-transformer
Another attempt at a long-context / efficient transformer by me
☆38 Updated 3 years ago
lucidrains / deep-linear-network
A simple implementation of a deep linear Pytorch module
☆21 Updated 5 years ago
lucidrains / rela-transformer
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
☆49 Updated 3 years ago
ColinQiyangLi / AdaCat
AdaCat
☆49 Updated 3 years ago
gisilvs / AEF
☆33 Updated 2 years ago
lucidrains / cross-transformers-pytorch
Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch
☆54 Updated 4 years ago
lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆50 Updated 3 years ago
lucidrains / long-short-transformer
Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch
☆120 Updated 4 years ago
AranKomat / Metroplex
☆21 Updated 2 years ago
NVlabs / VAEBM
The Official PyTorch Implementation of "VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models" (ICLR 2021 spotlight…
☆57 Updated 3 years ago
jiaweizzhao / ZerO-initialization
☆75 Updated 2 years ago
rwightman / imagenet-12k
ImageNet-12k subset of ImageNet-21k (fall11)
☆21 Updated 2 years ago
dlmacedo / distinction-maximization-loss
A project to improve out-of-distribution detection (open set recognition) and uncertainty estimation by changing a few lines of code in y…
☆44 Updated 3 years ago
tmabraham / Trans-CycleGAN
A convolution-free, transformer-only version of the CycleGAN framework
☆33 Updated 3 years ago
ClashLuke / PerfTorch
High performance pytorch modules
☆18 Updated 2 years ago
SamsungSAILMontreal / PAPA
Repository for the PopulAtion Parameter Averaging (PAPA) paper
☆27 Updated last year