hkproj / kan-notes
☆19 Updated last year
Alternatives and similar repositories for kan-notes
Users who are interested in kan-notes are comparing it to the repositories listed below.
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆117 Updated last year
- Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch ☆121 Updated this week
- Variations of Kolmogorov-Arnold Networks ☆114 Updated last year
- my attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture ☆129 Updated last year
- ☆36 Updated 2 weeks ago
- ☆46 Updated 2 months ago
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines. ☆375 Updated last year
- Visualizing some of the internals of a neural network during training and inference. ☆76 Updated last year
- Fine tune Gemma 3 on an object detection task ☆46 Updated this week
- Collection of tests performed during the study of the new Kolmogorov-Arnold Neural Networks (KAN) ☆39 Updated 3 months ago
- An easy to use PyTorch implementation of the Kolmogorov Arnold Network and a few novel variations ☆181 Updated 6 months ago
- ☆130 Updated 9 months ago
- Kolmogorov-Arnold Networks with various basis functions like B-Splines, Fourier, Chebyshev, Wavelets etc ☆35 Updated last year
- SaLSa Optimizer implementation (No learning rates needed) ☆30 Updated 2 weeks ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆100 Updated 5 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆32 Updated 2 weeks ago
- Understanding Kolmogorov-Arnold Networks: A Tutorial Series on KAN using Toy Examples ☆191 Updated last week
- PyTorch implementation of the xLSTM model by Beck et al. (2024) ☆165 Updated 9 months ago
- Gradient Boosting Reinforcement Learning (GBRL) ☆110 Updated last week
- This is the repository for brain state prediction using fMRI data and transformer. ☆80 Updated 10 months ago
- ☆31 Updated 11 months ago
- Kolmogorov–Arnold Networks with modified activation (using MLP to represent the activation) ☆105 Updated 7 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 Updated last year
- The AdEMAMix Optimizer: Better, Faster, Older. ☆183 Updated 8 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆41 Updated last year
- Benchmarking and Testing FastKAN ☆77 Updated last year
- σ-GPT: A New Approach to Autoregressive Models ☆65 Updated 9 months ago
- Working implementation of DeepSeek MLA ☆41 Updated 4 months ago
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning" ☆91 Updated 3 weeks ago
- A modified CNN architecture using Kolmogorov-Arnold Networks ☆80 Updated last year