claCase / Attention-as-RNN
Unofficial implementation of "Attention as an RNN" (https://arxiv.org/pdf/2405.13956), providing both an efficient associative parallel prefix scan version and a recurrent version.
☆20Updated 3 months ago
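The two variants named in the description can be illustrated with a minimal sketch. It is not taken from this repository: the function names (`combine`, `leaf`, `attention_recurrent`, `attention_prefix_scan`) are hypothetical, it uses plain Python lists rather than a tensor library, and it assumes a single query whose attention scores against each key are already computed. Each token carries a triple (running max score, softmax denominator, softmax numerator); merging two triples is associative, so the same operator can drive either a step-by-step recurrence or a prefix scan.

```python
import math

def combine(left, right):
    """Associatively merge two (max_score, denominator, numerator) triples."""
    m_l, c_l, a_l = left
    m_r, c_r, a_r = right
    m = max(m_l, m_r)
    # Rescale both sides to the shared max so the exponentials stay bounded.
    w_l, w_r = math.exp(m_l - m), math.exp(m_r - m)
    c = c_l * w_l + c_r * w_r
    a = [x * w_l + y * w_r for x, y in zip(a_l, a_r)]
    return m, c, a

def leaf(score, value):
    """Triple for a single token: exp(score - score) = 1 in the denominator."""
    return score, 1.0, list(value)

def attention_recurrent(scores, values):
    """RNN-style form: one constant-size state update per token."""
    state = leaf(scores[0], values[0])
    outputs = [[x / state[1] for x in state[2]]]
    for s, v in zip(scores[1:], values[1:]):
        state = combine(state, leaf(s, v))
        outputs.append([x / state[1] for x in state[2]])
    return outputs

def attention_prefix_scan(scores, values):
    """Hillis-Steele inclusive scan: about log2(N) rounds.

    The merges inside one round are independent of each other, so a
    parallel backend could run them at once; here they run sequentially.
    """
    states = [leaf(s, v) for s, v in zip(scores, values)]
    step = 1
    while step < len(states):
        states = [
            combine(states[i - step], states[i]) if i >= step else states[i]
            for i in range(len(states))
        ]
        step *= 2
    return [[x / c for x in a] for _, c, a in states]

def attention_softmax_reference(scores, values):
    """Direct softmax attention over every prefix, used only as a check."""
    outputs = []
    for k in range(len(scores)):
        m = max(scores[: k + 1])
        weights = [math.exp(s - m) for s in scores[: k + 1]]
        total = sum(weights)
        dim = len(values[0])
        outputs.append(
            [sum(w * values[i][d] for i, w in enumerate(weights)) / total
             for d in range(dim)]
        )
    return outputs

if __name__ == "__main__":
    scores = [0.5, -1.0, 2.0, 0.0]
    values = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [0.5, 0.5]]
    print(attention_recurrent(scores, values)[-1])
    print(attention_prefix_scan(scores, values)[-1])
```

Both functions return one output per prefix of the sequence, and both should agree with the direct softmax reference up to floating-point error. The recurrent form suits token-by-token inference with constant memory, while the scan form exposes parallelism across the sequence.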
Related projects
Alternatives and complementary repositories for Attention-as-RNN
- Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory"☆106Updated last week
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆50Updated last week
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin…☆21Updated last week
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o…☆41Updated 11 months ago
- Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze…☆84Updated last week
- This code implements a Radial Basis Function (RBF) based Kolmogorov-Arnold Network (KAN) for function approximation.☆25Updated 5 months ago
- RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks☆78Updated 3 months ago
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien…☆55Updated last week
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4.☆70Updated 8 months ago
- C++ and Cuda ops for fused FourierKAN☆73Updated 6 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"☆137Updated last week
- Explorations into improving ViTArc with Slot Attention☆37Updated last month
- my attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture☆130Updated 6 months ago
- State Space Models☆63Updated 6 months ago
- Pytorch implementation of the xLSTM model by Beck et al. (2024)☆141Updated 3 months ago
- Benchmarking and Testing FastKAN☆65Updated 5 months ago
- tinybig for deep function learning☆36Updated this week
- Awesome list of papers that extend Mamba to various applications.☆128Updated 2 months ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States☆42Updated 4 months ago
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models☆12Updated 8 months ago
- Kolmogorov-Arnold Networks (KAN) using Jacobi polynomials instead of B-splines.☆32Updated 6 months ago
- Benchmark for efficiency in memory and time of different KAN implementations.☆111Updated 2 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆50Updated 7 months ago
- An easy, reliable, fluid template for Python packages complete with docs, testing suites, READMEs, GitHub workflows, linting and much muc…☆146Updated last week
- A pytorch implementation of Fourier Analysis Networks (FAN)☆11Updated last month