lucidrains/token-shift-gpt

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/lucidrains/token-shift-gpt)

lucidrains / token-shift-gpt

Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing

☆49

Alternatives and similar repositories for token-shift-gpt

Users that are interested in token-shift-gpt are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

lucidrains / mlp-gpt-jax
View on GitHub
A GPT, made only of MLPs, in Jax
☆59Jun 23, 2021Updated 5 years ago
toshas / torch-discounted-cumsum
View on GitHub
Fast Discounted Cumulative Sums in PyTorch
☆98Aug 28, 2021Updated 4 years ago
jenni-ai / T2FW
View on GitHub
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆20Oct 9, 2022Updated 3 years ago
lucidrains / marge-pytorch
View on GitHub
Implementation of Marge, Pre-training via Paraphrasing, in Pytorch
☆76Jan 14, 2021Updated 5 years ago
lucidrains / esbn-transformer
View on GitHub
An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols
☆16Aug 3, 2021Updated 4 years ago
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
lucidrains / triangle-multiplicative-module
View on GitHub
Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, …
☆39Aug 3, 2021Updated 4 years ago
lucidrains / panoptic-transformer
View on GitHub
Another attempt at a long-context / efficient transformer by me
☆38Apr 11, 2022Updated 4 years ago
lucidrains / metaformer-gpt
View on GitHub
Implementation of Metaformer, but in an autoregressive manner
☆26Jun 21, 2022Updated 4 years ago
lucidrains / ponder-transformer
View on GitHub
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper
☆84Oct 30, 2021Updated 4 years ago
lucidrains / g-mlp-gpt
View on GitHub
GPT, but made only out of MLPs
☆89May 25, 2021Updated 5 years ago
lucidrains / multistream-transformers
View on GitHub
Implementation of Multistream Transformers in Pytorch
☆54Jul 31, 2021Updated 4 years ago
AlirezaMorsali / MLP-Attention
View on GitHub
☆17Dec 19, 2024Updated last year
lucidrains / nystrom-attention
View on GitHub
Implementation of Nyström Self-attention, from the paper Nyströmformer
☆145Mar 24, 2025Updated last year
lucidrains / molecule-attention-transformer
View on GitHub
Pytorch reimplementation of Molecule Attention Transformer, which uses a transformer to tackle the graph-like structure of molecules
☆58Dec 2, 2020Updated 5 years ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
lucidrains / coco-lm-pytorch
View on GitHub
Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch
☆46Mar 3, 2021Updated 5 years ago
j-towns / vdvae-jax
View on GitHub
Very deep VAEs in JAX/Flax
☆47Jun 16, 2021Updated 5 years ago
distdl / distdl
View on GitHub
☆21Aug 18, 2022Updated 3 years ago
lucidrains / logavgexp-torch
View on GitHub
Implementation of LogAvgExp for Pytorch
☆37Apr 10, 2025Updated last year
lucidrains / isab-pytorch
View on GitHub
An implementation of (Induced) Set Attention Block, from the Set Transformers paper
☆70Jun 8, 2026Updated last month
lucidrains / retrieval-augmented-ddpm
View on GitHub
Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch
☆66May 5, 2022Updated 4 years ago
kingoflolz / CLIP_JAX
View on GitHub
Contrastive Language-Image Pretraining
☆147Sep 6, 2022Updated 3 years ago
CHARM-Tx / linear_mem_attention_pytorch
View on GitHub
Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch
☆12Jan 16, 2022Updated 4 years ago
AeroScripts / HiddenEngrams
View on GitHub
Hidden Engrams: Long Term Memory for Transformer Model Inference
☆35Jun 26, 2021Updated 5 years ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
lucidrains / En-transformer
View on GitHub
Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-Equivariant Graph Neural Network
☆226Jun 2, 2024Updated 2 years ago
lucidrains / all-normalization-transformer
View on GitHub
A simple Transformer where the softmax has been replaced with normalization
☆20Sep 11, 2020Updated 5 years ago
antofuller / configaformers
View on GitHub
A python library for highly configurable transformers - easing model architecture search and experimentation.
☆48Nov 30, 2021Updated 4 years ago
cfoster0 / simple-diffusion-model
View on GitHub
Pedagogical codebase for a simplified score-based generative model design, with training loop
☆41Aug 28, 2021Updated 4 years ago
lucidrains / triton-transformer
View on GitHub
Implementation of a Transformer, but completely in Triton
☆279Apr 5, 2022Updated 4 years ago
mcoavoux / mtg
View on GitHub
Statistical discontinuous constituent parsing
☆11Feb 15, 2018Updated 8 years ago
AranKomat / Metroplex
View on GitHub
☆21Mar 15, 2023Updated 3 years ago
lucidrains / memformer
View on GitHub
Implementation of Memformer, a Memory-augmented Transformer, in Pytorch
☆126Nov 13, 2020Updated 5 years ago
lucidrains / pi-GAN-pytorch
View on GitHub
Implementation of π-GAN, for 3d-aware image synthesis, in Pytorch
☆125Feb 22, 2021Updated 5 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
lucidrains / long-short-transformer
View on GitHub
Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch
☆120Aug 4, 2021Updated 4 years ago
EleutherAI / pyfra
View on GitHub
Python Research Framework
☆107Nov 3, 2022Updated 3 years ago
lucidrains / NWT-pytorch
View on GitHub
Implementation of NWT, audio-to-video generation, in Pytorch
☆92Mar 17, 2022Updated 4 years ago
lucidrains / rela-transformer
View on GitHub
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
☆49Apr 6, 2022Updated 4 years ago
timvieira / vocrf
View on GitHub
Variable-order CRFs with structure learning
☆17Aug 1, 2024Updated last year
lucidrains / n-grammer-pytorch
View on GitHub
Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch
☆81Dec 4, 2022Updated 3 years ago
lucidrains / local-attention-flax
View on GitHub
Local Attention - Flax module for Jax
☆22May 26, 2021Updated 5 years ago