Training small GPT-2 style models using Kolmogorov-Arnold networks.
☆121 · May 25, 2024 · Updated last year
Alternatives and similar repositories for KAN-GPT-2
Users who are interested in KAN-GPT-2 are comparing it to the libraries listed below
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines. ☆404 · May 13, 2024 · Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Apr 17, 2024 · Updated last year
- The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling ☆725 · Nov 25, 2024 · Updated last year
- ☆21 · May 24, 2023 · Updated 2 years ago
- coded with and corrected by Google Anti-Gravity ☆13 · Nov 23, 2025 · Updated 3 months ago
- Kolmogorov-Arnold Networks with various basis functions like B-Splines, Fourier, Chebyshev, Wavelets etc ☆36 · May 8, 2024 · Updated last year
- Your favourite classical machine learning algos on the GPU/TPU ☆22 · Dec 14, 2025 · Updated 2 months ago
- Official PyTorch implementation of CD-MOE ☆12 · Mar 29, 2025 · Updated 11 months ago
- High order and sparse layers in pytorch. Lagrange Polynomial, Piecewise Lagrange Polynomial, Piecewise Discontinuous Lagrange Polynomial… ☆44 · Jun 24, 2024 · Updated last year
- Implementation for paper Automata Extraction from Transformers. ☆12 · Jun 8, 2024 · Updated last year
- ☆10 · Oct 28, 2024 · Updated last year
- Easy installer of kocohub dataset ☆24 · May 31, 2020 · Updated 5 years ago
- FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN) ☆468 · Jun 20, 2024 · Updated last year
- CUDA implementation of Wavelet KAN. ☆16 · Jun 8, 2024 · Updated last year
- This repository contains papers for a comprehensive survey on accelerated generation techniques in Large Language Models (LLMs). ☆11 · May 24, 2024 · Updated last year
- ☆13 · Jun 2, 2024 · Updated last year
- ☆13 · Jul 19, 2022 · Updated 3 years ago
- ☆32 · May 5, 2024 · Updated last year
- ☆35 · Jul 25, 2023 · Updated 2 years ago
- Bias, Hate classification with KoELECTRA 👿 ☆27 · Jun 12, 2023 · Updated 2 years ago
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆32 · Apr 20, 2024 · Updated last year
- [ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess" ☆37 · Mar 3, 2025 · Updated last year
- microjax: a micro, JAX-like function transformation engine ☆34 · Oct 25, 2024 · Updated last year
- Project for training Korean language models (Flax, PyTorch with Hugging Face Accelerate) ☆32 · Sep 13, 2023 · Updated 2 years ago
- Created by Francisco Angulo de Lafuente ⚡️Deploy the DEMO⬇️ ☆20 · Jan 1, 2025 · Updated last year
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems ☆22 · Jul 4, 2025 · Updated 8 months ago
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024) ☆35 · Apr 25, 2024 · Updated last year
- Additional multi-backend functionality for Keras 3. ☆16 · Mar 1, 2024 · Updated 2 years ago
- ALBERT Text Classification Tensorflow, Resume Classification ☆15 · Mar 28, 2020 · Updated 5 years ago
- The (B)ig (F)unction (T)axonomy is a detailed reference for common compute functions executed by different libraries, databases, and tool… ☆18 · Dec 12, 2024 · Updated last year
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?" ☆19 · Jun 11, 2025 · Updated 8 months ago
- KoRean based ELECTRA pre-trained models (KR-ELECTRA) for Tensorflow and PyTorch ☆15 · Feb 13, 2022 · Updated 4 years ago
- "Why do I feel offended?" - Korean Dataset for Offensive Language Identification (EACL2023 Findings) ☆15 · May 14, 2023 · Updated 2 years ago
- An easy to use PyTorch implementation of the Kolmogorov Arnold Network and a few novel variations ☆189 · Nov 24, 2024 · Updated last year
- ☆140 · May 8, 2024 · Updated last year
- ☆23 · Mar 7, 2025 · Updated last year
- Fine-Tune LLM Synthetic-Data application and "From Data to AGI: Unlocking the Secrets of Large Language Model" ☆16 · Jul 5, 2024 · Updated last year
- ☆79 · Feb 4, 2025 · Updated last year
- Troll Detector ☆15 · Nov 28, 2022 · Updated 3 years ago