AdityaNG / kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling
☆702 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for kan-gpt
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines. ☆347 · Updated 5 months ago
- ☆709 · Updated 5 months ago
- NanoGPT (124M) quality in 8.2 minutes ☆946 · Updated this week
- A comprehensive collection of KAN(Kolmogorov-Arnold Network)-related resources, including libraries, projects, tutorials, papers, and mor… ☆2,529 · Updated this week
- This project extends the idea of the innovative architecture of Kolmogorov-Arnold Networks (KAN) to the Convolutional Layers, changing th… ☆767 · Updated this week
- Official repository of the xLSTM. ☆1,363 · Updated this week
- Best practices & guides on how to write distributed pytorch training code ☆278 · Updated this week
- FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN) ☆365 · Updated 4 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆801 · Updated 2 months ago
- An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN). ☆4,065 · Updated 3 months ago
- An easy to use PyTorch implementation of the Kolmogorov Arnold Network and a few novel variations ☆158 · Updated 2 months ago
- nanoGPT style version of Llama 3.1 ☆1,231 · Updated 3 months ago
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆108 · Updated 5 months ago
- UNet diffusion model in pure CUDA ☆567 · Updated 4 months ago
- ☆448 · Updated 7 months ago
- System 2 Reasoning Link Collection ☆683 · Updated last week
- The Multilayer Perceptron Language Model ☆521 · Updated 3 months ago
- The best repository showing why transformers might not be the answer for time series forecasting and showcasing the best SOTA non transfo… ☆516 · Updated this week
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆513 · Updated 4 months ago
- Schedule-Free Optimization in PyTorch ☆1,880 · Updated this week
- A modern model graph visualizer and debugger ☆1,045 · Updated this week
- A native PyTorch Library for large model training ☆2,579 · Updated this week
- Open weights language model from Google DeepMind, based on Griffin. ☆606 · Updated 4 months ago
- Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜 ☆860 · Updated last month
- Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793 ☆322 · Updated last week
- Understanding Kolmogorov-Arnold Networks: A Tutorial Series on KAN using Toy Examples ☆162 · Updated 3 weeks ago
- Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch. ☆687 · Updated 2 months ago
- Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw ☆291 · Updated 3 months ago
- Make PyTorch models up to 40% faster! Thunder is a source to source compiler for PyTorch. It enables using different hardware executors a… ☆1,187 · Updated this week