a-r-r-o-w/kanformer
Naively combining transformers and Kolmogorov-Arnold Networks to learn and experiment
☆35 · Updated 6 months ago
Alternatives and similar repositories for kanformer:
Users interested in kanformer are comparing it to the libraries listed below.
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting… ☆122 · Updated last week
- ☆80 · Updated 7 months ago
- Implementation of Agent Attention in PyTorch ☆89 · Updated 6 months ago
- Kolmogorov–Arnold Networks with modified activation (using an MLP to represent the activation) ☆102 · Updated 3 months ago
- ☆122 · Updated 8 months ago
- PyTorch implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆155 · Updated this week
- Variations of Kolmogorov-Arnold Networks ☆112 · Updated 8 months ago
- Trying out the Mamba architecture on small examples (CIFAR-10, character-level Shakespeare, etc.) ☆43 · Updated last year
- Integrating Mamba/SSMs with Transformers for enhanced long-context, high-quality sequence modeling ☆182 · Updated this week
- My attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture ☆129 · Updated 8 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆50 · Updated 9 months ago
- This repository contains a better implementation of Kolmogorov-Arnold networks ☆59 · Updated 8 months ago
- KAN for Vision Transformer ☆240 · Updated 3 months ago
- Transformer model based on the Kolmogorov–Arnold Network (KAN), an alternative to the Multi-Layer Perceptron (MLP) ☆26 · Updated 2 months ago
- A modified CNN architecture using Kolmogorov-Arnold Networks ☆70 · Updated 8 months ago
- Benchmarking and testing FastKAN ☆70 · Updated 8 months ago
- Training small GPT-2-style models using Kolmogorov-Arnold networks ☆113 · Updated 8 months ago
- PyTorch (Lightning) implementation of the Mamba model ☆23 · Updated 9 months ago
- PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model … ☆51 · Updated 3 months ago
- Unofficial implementation of the Selective Attention Transformer ☆14 · Updated 3 months ago
- PyTorch implementation of the paper "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆24 · Updated this week
- An attempt to make the multiple residual streams from ByteDance's Hyper-Connections paper accessible to the public ☆66 · Updated last week
- Implementation of Infini-Transformer in PyTorch ☆109 · Updated 3 weeks ago
- Implementation of xLSTM in PyTorch from the paper "xLSTM: Extended Long Short-Term Memory" ☆115 · Updated this week
- The official repository for "HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction" ☆31 · Updated 2 weeks ago
- Collection of autoregressive model implementations ☆77 · Updated 3 weeks ago
- Official PyTorch implementation of "The Hidden Attention of Mamba Models" ☆211 · Updated 8 months ago
- Awesome list of papers that extend Mamba to various applications ☆129 · Updated last month
- Implementation of Griffin from the paper "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆51 · Updated this week
- An easy-to-use PyTorch implementation of the Kolmogorov-Arnold Network and a few novel variations ☆169 · Updated 2 months ago
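The common thread in the KAN-flavored repos above is swapping a transformer's MLP sub-layer for a KAN layer, in which each input-output edge learns its own univariate function rather than a single scalar weight. A minimal NumPy sketch of one such layer is below; it uses a Gaussian RBF basis as a stand-in for the B-spline basis of the original KAN paper, and the function name and shapes are illustrative assumptions, not code from any repository listed here.

```python
import numpy as np

def kan_layer(x, centers, coeffs):
    """One KAN-style layer: y_j = sum_i phi_ij(x_i).

    x:       (batch, d_in)   input activations
    centers: (num_basis,)    shared RBF centers
    coeffs:  (d_in, d_out, num_basis) per-edge basis coefficients,
             so every edge (i, j) has its own learned univariate function
    """
    # Evaluate the Gaussian RBF basis on each scalar input feature:
    # basis[b, i, k] = exp(-(x[b, i] - centers[k])^2)
    basis = np.exp(-(x[..., None] - centers) ** 2)  # (batch, d_in, num_basis)
    # Combine per-edge: sum over input features i and basis functions k
    return np.einsum('bik,iok->bo', basis, coeffs)  # (batch, d_out)

# Illustrative usage: a KAN "feed-forward" stack replacing an MLP block
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                    # batch of 4, model dim 8
centers = np.linspace(-2.0, 2.0, 5)
w1 = rng.standard_normal((8, 32, 5)) * 0.1         # expand to hidden dim 32
w2 = rng.standard_normal((32, 8, 5)) * 0.1         # project back to dim 8
hidden = kan_layer(x, centers, w1)
out = kan_layer(hidden, centers, w2)
```

In a full kanformer-style block, `out` would be added back to `x` via the usual residual connection; the attention sub-layer is unchanged, since only the pointwise MLP is replaced.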