prateekstark / retnet
☆14Updated 2 years ago
Alternatives and similar repositories for retnet
Users who are interested in retnet are comparing it to the libraries listed below.
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆56Updated last month
- An implementation of the paper "Retentive Network: A Successor to Transformer for Large Language Models" https://arxiv.org/pdf/2307.08621.pdf☆12Updated 2 years ago
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024)☆15Updated 9 months ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…☆65Updated last year
- ☆23Updated last year
- ☆16Updated 2 years ago
- State Space Models☆70Updated last year
- Implementation of Agent Attention in Pytorch☆92Updated last year
- Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022)☆85Updated 3 years ago
- A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (http…☆106Updated last year
- Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory"☆118Updated this week
- ☆27Updated last year
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4.☆87Updated last year
- PyTorch implementation of Retentive Network: A Successor to Transformer for Large Language Models☆14Updated 2 years ago
- Spectral Attention Autoregressive Model (SAAM)☆16Updated 2 years ago
- ☆16Updated 9 months ago
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o…☆43Updated 9 months ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248☆57Updated last year
- ☆19Updated 2 years ago
- CUDA implementation of Extended Long Short Term Memory (xLSTM) with C++ and PyTorch ports☆89Updated last year
- Implementations of some RNNs☆51Updated 2 years ago
- [ICLR 2025] Official Code Release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation☆46Updated 7 months ago
- HGRN2: Gated Linear RNNs with State Expansion☆54Updated last year
- MNIST example using Kolmogorov-Arnold Networks☆28Updated last year
- An easy-to-use implementation of xLSTM☆59Updated last month
- Source code for Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers☆17Updated last year
- The official Pytorch implementation of the paper "Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT …☆40Updated last year
- ☆47Updated 3 months ago
- Implementation of GateLoop Transformer in Pytorch and Jax☆90Updated last year
- Official repository of "Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models" [ICML 2023]☆22Updated 9 months ago