AdityaNG / kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling
☆725Nov 25, 2024Updated last year
Alternatives and similar repositories for kan-gpt
Users who are interested in kan-gpt are comparing it to the libraries listed below.
- An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).☆4,582Aug 1, 2024Updated last year
- Kolmogorov Arnold Networks☆16,164Jan 19, 2025Updated last year
- Training small GPT-2 style models using Kolmogorov-Arnold networks.☆121May 25, 2024Updated last year
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines.☆403May 13, 2024Updated last year
- A comprehensive collection of KAN(Kolmogorov-Arnold Network)-related resources, including libraries, projects, tutorials, papers, and mor…☆3,172Dec 14, 2025Updated 2 months ago
- Kolmogorov–Arnold Networks with modified activation (using MLP to represent the activation)☆108Oct 4, 2025Updated 4 months ago
- Benchmark for efficiency in memory and time of different KAN implementations.☆138Aug 26, 2024Updated last year
- Gemma 2B with 10M context length using Infini-attention.☆935May 12, 2024Updated last year
- Variations of Kolmogorov-Arnold Networks☆116May 15, 2024Updated last year
- FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN)☆467Jun 20, 2024Updated last year
- ☆748May 24, 2024Updated last year
- Kolmogorov-Arnold Network for Reinforcement Learning, initial experiments☆296Apr 9, 2025Updated 10 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars☆987Jul 23, 2024Updated last year
- This project extends the idea of the innovative architecture of Kolmogorov-Arnold Networks (KAN) to the Convolutional Layers, changing th…☆914Apr 8, 2025Updated 10 months ago
- KAN for Vision Transformer☆255Oct 7, 2024Updated last year
- PyTorch native post-training library☆5,669Updated this week
- TKAN: Temporal Kolmogorov-Arnold Networks☆225Dec 16, 2024Updated last year
- An easy to use PyTorch implementation of the Kolmogorov Arnold Network and a few novel variations☆189Nov 24, 2024Updated last year
- Official repository of Evolutionary Optimization of Model Merging Recipes☆1,395Nov 29, 2024Updated last year
- Tools for merging pretrained large language models.☆6,783Jan 26, 2026Updated 2 weeks ago
- Mamba SSM architecture☆17,186Jan 12, 2026Updated last month
- [ICLR2025] Kolmogorov-Arnold Transformer☆854Mar 23, 2025Updated 10 months ago
- This project is dedicated to the implementation and research of Kolmogorov-Arnold convolutional networks. The repository includes implem…☆531Nov 19, 2024Updated last year
- Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"☆3,334May 4, 2024Updated last year
- Pandora: Towards General World Model with Natural Language Actions and Video States☆533Sep 23, 2024Updated last year
- llama3.np is a pure NumPy implementation for Llama 3 model.☆991Apr 27, 2025Updated 9 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.☆10,309Jul 1, 2024Updated last year
- Unofficial PyTorch/🤗Transformers(Gemma/Llama3) implementation of Leave No Context Behind: Efficient Infinite Context Transformers with I…☆374Apr 23, 2024Updated last year
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user…☆1,327Feb 13, 2025Updated last year
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.☆13,155Feb 8, 2026Updated last week
- Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p…☆1,315Aug 8, 2025Updated 6 months ago
- Linear Attention Sequence Parallelism (LASP)☆88Jun 4, 2024Updated last year
- High order and sparse layers in pytorch. Lagrange Polynomial, Piecewise Lagrange Polynomial, Piecewise Discontinuous Lagrange Polynomial…☆44Jun 24, 2024Updated last year
- Large World Model -- Modeling Text and Video with Millions Context☆7,396Oct 19, 2024Updated last year
- Run Mixtral-8x7B models in Colab or consumer desktops☆2,325Apr 8, 2024Updated last year
- Mora: More like Sora for Generalist Video Generation☆1,584Oct 10, 2024Updated last year
- ☆137Aug 19, 2024Updated last year
- PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily wri…☆1,440Feb 3, 2026Updated last week
- Lumina-T2X is a unified framework for Text to Any Modality Generation☆2,251Feb 16, 2025Updated 11 months ago