AdityaNG / kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling
☆718 · Updated 7 months ago
Alternatives and similar repositories for kan-gpt
Users who are interested in kan-gpt compare it to the libraries listed below.
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines. ☆380 · Updated last year
- ☆738 · Updated last year
- This project extends the idea of the innovative architecture of Kolmogorov-Arnold Networks (KAN) to the Convolutional Layers, changing th… ☆876 · Updated 2 months ago
- An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN). ☆4,388 · Updated 10 months ago
- FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN) ☆418 · Updated last year
- An easy to use PyTorch implementation of the Kolmogorov Arnold Network and a few novel variations ☆183 · Updated 7 months ago
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆118 · Updated last year
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆883 · Updated last month
- A comprehensive collection of KAN(Kolmogorov-Arnold Network)-related resources, including libraries, projects, tutorials, papers, and mor… ☆2,973 · Updated 4 months ago
- Variations of Kolmogorov-Arnold Networks ☆115 · Updated last year
- The Multilayer Perceptron Language Model ☆554 · Updated 10 months ago
- PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention… ☆289 · Updated last year
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆555 · Updated 11 months ago
- KAN for Vision Transformer ☆248 · Updated 8 months ago
- Schedule-Free Optimization in PyTorch ☆2,180 · Updated last month
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆554 · Updated 5 months ago
- Open weights language model from Google DeepMind, based on Griffin. ☆641 · Updated 3 weeks ago
- nanoGPT style version of Llama 3.1 ☆1,386 · Updated 10 months ago
- Annotated version of the Mamba paper ☆485 · Updated last year
- A simple and efficient Mamba implementation in pure PyTorch and MLX. ☆1,261 · Updated 6 months ago
- Mamba-Chat: A chat LLM based on the state-space model architecture 🐍 ☆925 · Updated last year
- Implementation of Diffusion Transformer (DiT) in JAX ☆278 · Updated last year
- Build high-performance AI models with modular building blocks ☆528 · Updated 2 weeks ago
- Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and … ☆1,367 · Updated this week
- Understanding Kolmogorov-Arnold Networks: A Tutorial Series on KAN using Toy Examples ☆191 · Updated 3 weeks ago
- "Deep Dive into AI with MLX and PyTorch" is an educational initiative designed to help anyone interested in AI, specifically in machine l… ☆476 · Updated last month
- Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆411 · Updated 10 months ago
- NanoGPT (124M) in 3 minutes ☆2,699 · Updated this week
- Benchmark for efficiency in memory and time of different KAN implementations. ☆126 · Updated 9 months ago
- A JAX research toolkit for building, editing, and visualizing neural networks. ☆1,790 · Updated last month