lucidrains / complex-valued-transformer
Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"
☆66 Updated last year
Alternatives and similar repositories for complex-valued-transformer:
Users who are interested in complex-valued-transformer are comparing it to the libraries listed below:
- A State-Space Model with Rational Transfer Function Representation. ☆76 Updated 8 months ago
- Implementation of GateLoop Transformer in Pytorch and Jax ☆87 Updated 7 months ago
- Deep Learning Model for Signal Data ☆85 Updated 5 years ago
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ☆82 Updated 11 months ago
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆98 Updated last month
- ☆54 Updated last year
- Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule ☆76 Updated 2 weeks ago
- Code for the paper: Complex-Valued Autoencoders for Object Discovery ☆48 Updated last year
- ☆22 Updated last month
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications ☆33 Updated 2 months ago
- Sequence Modeling with Structured State Spaces ☆61 Updated 2 years ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆121 Updated last year
- A Triton Kernel for incorporating Bi-Directionality in Mamba2 ☆60 Updated last month
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne… ☆115 Updated 2 years ago
- ☆22 Updated 2 months ago
- Pytorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5) ☆72 Updated 8 months ago
- Implementation of a Light Recurrent Unit in Pytorch ☆47 Updated 3 months ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆112 Updated 3 months ago
- Complex tensor and complex functions for pytorch. ☆48 Updated 2 years ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆58 Updated 3 months ago
- Implementations of various linear RNN layers using pytorch and triton ☆49 Updated last year
- ☆163 Updated last year
- Trying out the Mamba architecture on small examples (cifar-10, shakespeare char level etc.) ☆42 Updated last year
- Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens) ☆44 Updated 3 months ago
- Implementation of Agent Attention in Pytorch ☆89 Updated 6 months ago
- Transformers w/o Attention, based fully on MLPs ☆91 Updated 9 months ago
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4. ☆73 Updated 10 months ago
- Visualizing representations with diffusion based conditional generative model. ☆87 Updated last year
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public ☆62 Updated 2 weeks ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆44 Updated last year