peytontolbert / Griffin
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
☆13 · Updated last year
Alternatives and similar repositories for Griffin:
Users who are interested in Griffin are comparing it to the libraries listed below.
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆52 · Updated 3 weeks ago
- HGRN2: Gated Linear RNNs with State Expansion ☆54 · Updated 8 months ago
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… ☆20 · Updated this week
- ☆23 · Updated 7 months ago
- ☆31 · Updated 6 months ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆66 · Updated 9 months ago
- Kolmogorov–Arnold Networks with modified activation (using MLP to represent the activation) ☆103 · Updated 5 months ago
- Toy genetic algorithm in Pytorch ☆39 · Updated this week
- Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory" ☆119 · Updated 3 weeks ago
- This is a simple torch implementation of the high performance Multi-Query Attention ☆16 · Updated last year
- ☆40 · Updated 3 months ago
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4. ☆80 · Updated last year
- Explorations into the recently proposed Taylor Series Linear Attention ☆98 · Updated 8 months ago
- ☆30 · Updated 5 months ago
- Self contained pytorch implementation of a sinkhorn based router, for mixture of experts or otherwise ☆34 · Updated 7 months ago
- Minimal Implementation of Visual Autoregressive Modelling (VAR) ☆30 · Updated last month
- Kolmogorov-Arnold Networks (KAN) using Jacobi polynomials instead of B-splines. ☆38 · Updated 11 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆37 · Updated last year
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆26 · Updated 2 weeks ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆64 · Updated last year
- We study toy models of skill learning. ☆25 · Updated 3 months ago
- Explorations into improving ViTArc with Slot Attention ☆40 · Updated 6 months ago
- Griffin MQA + Hawk Linear RNN Hybrid ☆85 · Updated last year
- This code implements a Radial Basis Function (RBF) based Kolmogorov-Arnold Network (KAN) for function approximation. ☆28 · Updated 10 months ago
- Remasking Discrete Diffusion Models with Inference-Time Scaling ☆18 · Updated last month
- ☆27 · Updated 9 months ago
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant … ☆16 · Updated last year
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 · Updated last year
- Exploration into the Scaling Value Iteration Networks paper, from Schmidhuber's group ☆36 · Updated 7 months ago
- ☆31 · Updated 11 months ago