lucidrains / perceiver-ar-pytorch

Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch

☆89

Alternatives and similar repositories for perceiver-ar-pytorch

Users who are interested in perceiver-ar-pytorch are comparing it to the libraries listed below.

lucidrains / Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
☆204 Updated last year
lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆101 Updated 2 years ago
facebookresearch / flashy
Framework for writing deep learning training loops. Lightweight, and retaining full freedom to design as you see fit. It handles checkpo…
☆115 Updated last year
Jack000 / DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
☆89 Updated 3 years ago
lucidrains / rvq-vae-gpt
My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation
☆88 Updated 9 months ago
lucidrains / kalman-filtering-attention
Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"
☆58 Updated last year
lucidrains / gradnorm-pytorch
A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch
☆100 Updated last year
crowsonkb / LDLM
Latent Diffusion Language Models
☆68 Updated last year
lucidrains / insertion-deletion-ddpm
Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models
☆30 Updated 3 years ago
lucidrains / gateloop-transformer
Implementation of GateLoop Transformer in Pytorch and Jax
☆89 Updated last year
lucidrains / hourglass-transformer-pytorch
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
☆91 Updated 3 years ago
lucidrains / ponder-transformer
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper
☆81 Updated 3 years ago
lucidrains / self-reasoning-tokens-pytorch
Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto
☆56 Updated last year
google-research / perceiver-ar
☆241 Updated last month
lucidrains / RQ-Transformer
Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization"
☆112 Updated 3 years ago
ColinQiyangLi / AdaCat
AdaCat
☆49 Updated 3 years ago
microsoft / ResiDual
ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802
☆95 Updated last year
lucidrains / compositional-attention-pytorch
Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi…
☆51 Updated 3 years ago
lucidrains / GAF-microbatch-pytorch
Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch
☆25 Updated 6 months ago
crowsonkb / consistency-models
A JAX implementation of the continuous time formulation of Consistency Models
☆85 Updated 2 years ago
aliutkus / spe
Relative Positional Encoding for Transformers with Linear Complexity
☆64 Updated 3 years ago
lucidrains / quartic-transformer
Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens)
☆52 Updated 4 months ago
lucidrains / adam-atan2-pytorch
Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch
☆112 Updated 8 months ago
lucidrains / discrete-key-value-bottleneck-pytorch
Implementation of Discrete Key / Value Bottleneck, in Pytorch
☆88 Updated 2 years ago
lucidrains / mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
☆120 Updated 9 months ago
lucidrains / pytorch-custom-utils
Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new…
☆124 Updated last year
infocusp / diffusion_models
Minimal standalone example of diffusion model
☆159 Updated 3 years ago
lucidrains / transframer-pytorch
Implementation of Transframer, Deepmind's U-net + Transformer architecture for up to 30 seconds video generation, in Pytorch
☆72 Updated 2 years ago
gcambara / cape
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
☆41 Updated 2 years ago
ctlllll / SGConv
☆163 Updated 2 years ago