knotgrass / GriffinLinks

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

☆12

Alternatives and similar repositories for Griffin

Users who are interested in Griffin are comparing it to the repositories listed below

kyegomez / MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
☆204 Updated 3 weeks ago
jacobfa / fft
☆128 Updated 3 weeks ago
Indoxer / LKAN
Variations of Kolmogorov-Arnold Networks
☆115 Updated last year
akaashdash / kansformers
☆136 Updated last year
StarostinV / convkan
Convolutional layer for Kolmogorov-Arnold Network (KAN)
☆106 Updated 5 months ago
1ssb / torchkan
An easy to use PyTorch implementation of the Kolmogorov Arnold Network and a few novel variations
☆188 Updated 9 months ago
kyegomez / Griffin
Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"
☆56 Updated 2 weeks ago
andrewgcodes / xlstm
my attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture
☆131 Updated last year
CG80499 / KAN-GPT-2
Training small GPT-2 style models using Kolmogorov-Arnold networks.
☆121 Updated last year
kyegomez / MoE-Mamba
Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze…
☆110 Updated 3 weeks ago
chenziwenhaoshuai / Vision-KAN
KAN for Vision Transformer
☆252 Updated 10 months ago
SynodicMonth / ChebyKAN
Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines.
☆385 Updated last year
Zhangyanbo / MLP-KAN
Kolmogorov–Arnold Networks with modified activation (using MLP to represent the activation)
☆106 Updated 10 months ago
quiqi / relu_kan
☆95 Updated last year
lucidrains / medical-ai-experiments
A repository to house some personal attempts to beat some state-of-the-art for medical datasets
☆99 Updated last year
zavareh1 / Wav-KAN
This repository contains the codes to replicate the simulations from the paper: "Wav-KAN: Wavelet Kolmogorov-Arnold Networks". It showca…
☆148 Updated 2 months ago
fkodom / yet-another-retnet
A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (http…
☆105 Updated last year
JaouadT / KANU_Net
U-Net architecture with Kolmogorov-Arnold Convolutions (KA convolutions)
☆42 Updated last week
tommyip / mamba2-minimal
Minimal Mamba-2 implementation in PyTorch
☆218 Updated last year
GistNoesis / FusedFourierKAN
C++ and Cuda ops for fused FourierKAN
☆80 Updated last year
myscience / x-lstm
Pytorch implementation of the xLSTM model by Beck et al. (2024)
☆171 Updated last year
kyegomez / Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
☆186 Updated 2 weeks ago
jakariaemon / CNN-KAN
A modified CNN architecture using Kolmogorov-Arnold Networks
☆83 Updated last year
kyegomez / xLSTM
Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory"
☆119 Updated 2 weeks ago
TariqAHassan / S4Torch
PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4.
☆84 Updated last year
NX-AI / vision-lstm
xLSTM as Generic Vision Backbone
☆485 Updated 9 months ago
AmeenAli / HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
☆226 Updated last year
jwzhanggy / tinyBIG
tinybig for deep function learning
☆61 Updated 2 months ago
mlsquare / xKAN
Kolmogorov-Arnold Networks with various basis functions like B-Splines, Fourier, Chebyshev, Wavelets etc
☆35 Updated last year
radarFudan / mamba
☆18 Updated 10 months ago