AdityaNG / kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling
☆711 · Updated last month
Alternatives and similar repositories for kan-gpt:
Users interested in kan-gpt are comparing it to the repositories listed below.
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines. ☆357 · Updated 8 months ago
- ☆715 · Updated 7 months ago
- FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN) ☆377 · Updated 6 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆831 · Updated last month
- An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN). ☆4,192 · Updated 5 months ago
- A comprehensive collection of KAN (Kolmogorov-Arnold Network)-related resources, including libraries, projects, tutorials, papers, and more. ☆2,705 · Updated this week
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆113 · Updated 7 months ago
- Variations of Kolmogorov-Arnold Networks ☆112 · Updated 8 months ago
- An easy-to-use PyTorch implementation of the Kolmogorov-Arnold Network and a few novel variations ☆169 · Updated last month
- Code for the BLT research paper ☆1,314 · Updated this week
- Reaching LLaMA2 Performance with 0.1M Dollars ☆965 · Updated 5 months ago
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆538 · Updated 6 months ago
- NanoGPT (124M) in 3.4 minutes ☆2,068 · Updated last week
- Schedule-Free Optimization in PyTorch ☆2,061 · Updated last month
- Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors a… ☆1,257 · Updated this week
- Build high-performance AI models with modular building blocks ☆456 · Updated this week
- PyTorch implementation of "Jamba: A Hybrid Transformer-Mamba Language Model" ☆154 · Updated 2 months ago
- The Multilayer Perceptron Language Model ☆532 · Updated 5 months ago
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ☆1,481 · Updated 2 months ago
- PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention… ☆286 · Updated 8 months ago
- Annotated version of the Mamba paper ☆469 · Updated 10 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆541 · Updated 3 weeks ago
- A PyTorch-native library for large model training ☆3,091 · Updated this week
- Official repository of the xLSTM. ☆1,635 · Updated this week
- Simple, minimal implementation of the Mamba SSM in one PyTorch file, using logcumsumexp (Heisen sequence). ☆104 · Updated 3 months ago
- Mamba-Chat: A chat LLM based on the state-space model architecture 🐍 ☆916 · Updated 10 months ago
- Minimalistic 4D-parallelism distributed training framework for education purposes ☆644 · Updated this week
- Open-weights language model from Google DeepMind, based on Griffin. ☆614 · Updated 6 months ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆178 · Updated 4 months ago
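Several of the repositories above replace the B-spline basis of the original KAN with Chebyshev polynomials: each edge learns a univariate function expressed as a weighted sum of Chebyshev polynomials of the first kind. A minimal sketch of such a layer is below; it is illustrative only (the class name, shapes, and normalization are assumptions, not the API of any listed repository):

```python
import torch
import torch.nn as nn

class ChebyKANLayer(nn.Module):
    """Sketch of a KAN layer with a Chebyshev polynomial basis (hypothetical)."""

    def __init__(self, in_dim: int, out_dim: int, degree: int = 4):
        super().__init__()
        self.degree = degree
        # One learnable coefficient per (input, output, basis function).
        self.coeffs = nn.Parameter(
            torch.randn(in_dim, out_dim, degree + 1)
            / (in_dim * (degree + 1)) ** 0.5
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Squash inputs into [-1, 1], the domain of Chebyshev polynomials.
        x = torch.tanh(x)
        # Build T_0..T_degree via the recurrence T_k = 2x*T_{k-1} - T_{k-2}.
        T = [torch.ones_like(x), x]
        for _ in range(2, self.degree + 1):
            T.append(2 * x * T[-1] - T[-2])
        basis = torch.stack(T[: self.degree + 1], dim=-1)  # (..., in_dim, degree+1)
        # Sum the learned univariate functions over inputs, per output unit.
        return torch.einsum("...id,iod->...o", basis, self.coeffs)

layer = ChebyKANLayer(8, 3)
out = layer(torch.randn(2, 8))
print(out.shape)  # torch.Size([2, 3])
```

Compared with the B-spline basis of the original KAN, the Chebyshev recurrence needs no grid of knots and is a few dense tensor ops, which is why these variants are typically faster on GPU.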