hkproj / kan-notes
☆19Updated last year
Alternatives and similar repositories for kan-notes
Users who are interested in kan-notes are comparing it to the libraries listed below
- Training small GPT-2 style models using Kolmogorov-Arnold networks.☆117Updated 11 months ago
- My attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture☆130Updated last year
- Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines.☆373Updated last year
- An easy-to-use PyTorch implementation of the Kolmogorov-Arnold Network and a few novel variations☆180Updated 5 months ago
- Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊☆122Updated last month
- Visualizing some of the internals of a neural network during training and inference.☆75Updated last year
- Gradient Boosting Reinforcement Learning (GBRL)☆108Updated last month
- LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆102Updated last year
- This repository contains a better implementation of Kolmogorov-Arnold networks☆61Updated last year
- Simple, minimal implementation of the Mamba SSM in one PyTorch file. Using logcumsumexp (Heisen sequence).☆116Updated 6 months ago
- Making the official Triton tutorials actually comprehensible☆30Updated last month
- Variations of Kolmogorov-Arnold Networks☆114Updated last year
- ☆31Updated 10 months ago
- An open source implementation of LFMs from Liquid AI: Liquid Foundation Models☆96Updated 7 months ago
- This is the code that went into our practical dive using Mamba for information extraction☆54Updated last year
- Flexible Python library providing building blocks (layers) for reproducible Transformers research (TensorFlow ✅, PyTorch 🔜, and JAX 🔜)☆53Updated last year
- Kolmogorov-Arnold Networks with various basis functions like B-Splines, Fourier, Chebyshev, Wavelets etc☆34Updated last year
- SaLSa Optimizer implementation (No learning rates needed)☆30Updated this week
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"☆553Updated 10 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆54Updated last year
- Rebuild the Stable Diffusion Model in a single Python script. Tutorial for Harvard ML from Scratch Series☆205Updated 3 months ago
- ☆46Updated last month
- Video+code lecture on building nanoGPT from scratch☆67Updated 11 months ago
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆186Updated 11 months ago
- LLMs represent numbers on a helix and manipulate that helix to do addition.☆24Updated 3 months ago
- First-principle implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting…☆164Updated last month
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.☆170Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆99Updated 4 months ago
- ☆111Updated 8 months ago
- Documented and Unit Tested educational Deep Learning framework with Autograd from scratch.☆111Updated last year