lucidrains / complex-valued-transformer
Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"
☆57 · Updated 11 months ago
Related projects:
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ☆74 · Updated 7 months ago
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆87 · Updated 3 weeks ago
- Implementation of GateLoop Transformer in Pytorch and Jax ☆86 · Updated 3 months ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆56 · Updated 10 months ago
- Code for the paper: Complex-Valued Autoencoders for Object Discovery ☆47 · Updated last year
- Explorations into the recently proposed Taylor Series Linear Attention ☆85 · Updated last month
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆119 · Updated 11 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆61 · Updated 4 months ago
- Sequence Modeling with Structured State Spaces ☆60 · Updated 2 years ago
- Implementation of Agent Attention in Pytorch ☆83 · Updated 2 months ago
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… ☆115 · Updated last year
- Pytorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5) ☆58 · Updated 4 months ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆48 · Updated last week
- ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802 ☆87 · Updated last year
- A PyTorch implementation of Bayesian flow networks (Graves et al., 2023). ☆21 · Updated 9 months ago
- Minimal Mamba-2 implementation in PyTorch ☆89 · Updated 3 months ago
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4. ☆61 · Updated 6 months ago
- Implementation of a Light Recurrent Unit in Pytorch ☆43 · Updated 2 weeks ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆87 · Updated 8 months ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆101 · Updated last year
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆43 · Updated last year
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆53 · Updated 4 months ago
- Implementations of various linear RNN layers using pytorch and triton ☆42 · Updated last year
- A Triton Kernel for incorporating Bi-Directionality in Mamba2 ☆43 · Updated last week
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆94 · Updated last month
- ☆19 · Updated 8 months ago
- Trying out the Mamba architecture on small examples (cifar-10, shakespeare char level etc.) ☆40 · Updated 9 months ago
- Recursive Least Squares (RLS) with Neural Network for fast learning ☆48 · Updated 10 months ago
- Transformers w/o Attention, based fully on MLPs ☆85 · Updated 5 months ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆94 · Updated last year