lucidrains / block-recurrent-transformer-pytorch
Implementation of Block Recurrent Transformer - Pytorch
☆211 · Updated 3 weeks ago
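For context, the repository packages block-recurrent attention behind a single `BlockRecurrentTransformer` module. Below is a minimal usage sketch modeled on the project's README; the constructor and forward arguments shown (e.g. `num_state_vectors`, `recurrent_layers`, `xl_memories`, `states`) are assumptions from that README and may differ between versions.

```python
import torch
from block_recurrent_transformer_pytorch import BlockRecurrentTransformer

# parameter names follow the project's README and are assumptions here
model = BlockRecurrentTransformer(
    num_tokens = 20000,        # vocabulary size
    dim = 512,                 # model dimension
    depth = 6,                 # number of layers
    heads = 8,                 # attention heads
    max_seq_len = 1024,        # total receptive field
    block_width = 512,         # width of each recurrent block
    num_state_vectors = 512,   # number of recurrent state vectors
    recurrent_layers = (4,),   # which layer(s) carry the block recurrence
)

seq = torch.randint(0, 20000, (2, 1024))

# each call returns logits plus the XL memories and recurrent states,
# which are fed back in when processing the next segment
out, mems, states = model(seq)
out, mems, states = model(seq, xl_memories = mems, states = states)
```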
Related projects:
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch ☆391 · Updated 7 months ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆202 · Updated 3 weeks ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆233 · Updated 4 months ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆203 · Updated last year
- Sequence modeling with Mega. ☆296 · Updated last year
- Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch ☆222 · Updated last week
- Recurrent Memory Transformer ☆148 · Updated last year
- Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in Pytorch ☆278 · Updated 3 months ago
- Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"