CG80499 / KAN-GPT-2
Training small GPT-2 style models using Kolmogorov-Arnold networks.
☆121May 25, 2024Updated last year
Alternatives and similar repositories for KAN-GPT-2
Users interested in KAN-GPT-2 are comparing it to the libraries listed below.
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines.☆403May 13, 2024Updated last year
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated last year
- The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling☆725Nov 25, 2024Updated last year
- ☆21May 24, 2023Updated 2 years ago
- coded with and corrected by Google Anti-Gravity☆13Nov 23, 2025Updated 2 months ago
- Kolmogorov-Arnold Networks with various basis functions like B-Splines, Fourier, Chebyshev, Wavelets etc☆36May 8, 2024Updated last year
- Your favourite classical machine learning algos on the GPU/TPU☆21Dec 14, 2025Updated 2 months ago
- Long Long Term Memory Neural Net Cells☆10Jan 25, 2022Updated 4 years ago
- Official PyTorch implementation of CD-MOE☆12Mar 29, 2025Updated 10 months ago
- Convolutional layer for Kolmogorov-Arnold Network (KAN)☆115Mar 25, 2025Updated 10 months ago
- Expanded KR-BERT for Sentiment Analysis☆13Apr 23, 2021Updated 4 years ago
- ☆10Oct 28, 2024Updated last year
- Implementation for paper Automata Extraction from Transformers.☆11Jun 8, 2024Updated last year
- Easy installer of kocohub dataset☆24May 31, 2020Updated 5 years ago
- FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN)☆466Jun 20, 2024Updated last year
- ☆13Jul 19, 2022Updated 3 years ago
- ☆12Jun 2, 2024Updated last year
- This repository contains papers for a comprehensive survey on accelerated generation techniques in Large Language Models (LLMs).☆11May 24, 2024Updated last year
- CUDA implementation of Wavelet KAN.☆16Jun 8, 2024Updated last year
- ☆13Jun 28, 2021Updated 4 years ago
- ☆35Jul 25, 2023Updated 2 years ago
- ☆32May 5, 2024Updated last year
- Bias, Hate classification with KoELECTRA 👿☆27Jun 12, 2023Updated 2 years ago
- ☆35Apr 12, 2024Updated last year
- A project for training Korean language models (Flax, PyTorch with Hugging Face Accelerate)☆32Sep 13, 2023Updated 2 years ago
- [ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess"☆36Mar 3, 2025Updated 11 months ago
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…☆32Apr 20, 2024Updated last year
- microjax: a micro, JAX-like function transformation engine☆34Oct 25, 2024Updated last year
- This repository contains a better implementation of Kolmogorov-Arnold networks☆63Jun 1, 2025Updated 8 months ago
- KernelBench v2: Can LLMs Write GPU Kernels? - Benchmark with Torch -> Triton (and more!) problems☆21Jul 4, 2025Updated 7 months ago
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024)☆36Apr 25, 2024Updated last year
- Additional multi-backend functionality for Keras 3.☆16Mar 1, 2024Updated last year
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"☆19Jun 11, 2025Updated 8 months ago
- ALBERT Text Classification Tensorflow, Resume Classification☆15Mar 28, 2020Updated 5 years ago
- KoRean based ELECTRA pre-trained models (KR-ELECTRA) for Tensorflow and PyTorch☆15Feb 13, 2022Updated 4 years ago
- Adapts Google's official ROUGE implementation for use with Korean☆18Jan 3, 2024Updated 2 years ago
- The (B)ig (F)unction (T)axonomy is a detailed reference for common compute functions executed by different libraries, databases, and tool…☆18Dec 12, 2024Updated last year
- "Why do I feel offended?" - Korean Dataset for Offensive Language Identification (EACL2023 Findings)☆15May 14, 2023Updated 2 years ago
- Convert Numerical Representations to Korean Pronunciation☆14Apr 20, 2020Updated 5 years ago