lucidrains / discrete-key-value-bottleneck-pytorchLinks

Implementation of Discrete Key / Value Bottleneck, in Pytorch

☆88

Alternatives and similar repositories for discrete-key-value-bottleneck-pytorch

Users that are interested in discrete-key-value-bottleneck-pytorch are comparing it to the libraries listed below

Sorting:

lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆101Updated 2 years ago
lucidrains / Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
☆206Updated 2 years ago
lucidrains / compositional-attention-pytorch
Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi…
☆51Updated 3 years ago
lucidrains / hourglass-transformer-pytorch
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
☆95Updated 3 years ago
patil-suraj / vit-vqgan
JAX implementation ViT-VQGAN
☆82Updated 3 years ago
lucidrains / mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
☆119Updated last year
lucidrains / mirasol-pytorch
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch
☆88Updated last year
lucidrains / einops-exts
Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️
☆55Updated 2 years ago
lucidrains / retrieval-augmented-ddpm
Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch
☆65Updated 3 years ago
lsj2408 / URPE
[NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation)
☆33Updated 2 years ago
jiaweizzhao / ZerO-initialization
☆75Updated 2 years ago
Newbeeer / stf
Code for ICLR 2023 Paper, "Stable Target Field for Reduced Variance Score Estimation in Diffusion Models”
☆76Updated 2 years ago
LAION-AI / laion50BU
Un-*** 50 billions multimodality dataset
☆22Updated 3 years ago
haoliuhl / language-quantized-autoencoders
Language Quantized AutoEncoders
☆110Updated 2 years ago
lucidrains / MaMMUT-pytorch
Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch
☆102Updated 2 years ago
lucidrains / token-shift-gpt
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆50Updated 3 years ago
thjashin / multires-conv
Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)
☆127Updated 2 years ago
facebookresearch / SIMAT
codebase for the SIMAT dataset and evaluation
☆38Updated 3 years ago
lucidrains / autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
☆45Updated 2 years ago
lucidrains / taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
☆99Updated last year
RobertCsordas / linear_layer_as_attention
The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …
☆16Updated 4 months ago
lucidrains / zorro-pytorch
Implementation of Zorro, Masked Multimodal Transformer, in Pytorch
☆96Updated 2 years ago
lucidrains / coordinate-descent-attention
Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk
☆45Updated 2 years ago
lucidrains / product-key-memory
Standalone Product Key Memory module in Pytorch - for augmenting Transformer models
☆83Updated last year
lucidrains / recurrent-interface-network-pytorch
Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in P…
☆204Updated last year
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆113Updated 9 months ago
lucidrains / kalman-filtering-attention
Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"
☆59Updated 2 years ago
microsoft / ResiDual
ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802
☆96Updated 2 years ago
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆52Updated 2 years ago
lucidrains / quartic-transformer
Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens)
☆55Updated 7 months ago