thomasahle / kanmlps
KANs and MLPs
☆11 · Updated last year
Alternatives and similar repositories for kanmlps
Users who are interested in kanmlps are comparing it to the libraries listed below.
- 🧱 Modula software package ☆204 · Updated 3 months ago
- ☆98 · Updated 5 months ago
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆50 · Updated last year
- ☆197 · Updated 7 months ago
- Accelerated First Order Parallel Associative Scan ☆182 · Updated 10 months ago
- ☆26 · Updated 2 weeks ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆84 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆147 · Updated 2 weeks ago
- Evaluating the Mamba architecture on the Othello game ☆47 · Updated last year
- ☆40 · Updated last year
- ☆53 · Updated 9 months ago
- ☆17 · Updated 10 months ago
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆179 · Updated last month
- DeMo: Decoupled Momentum Optimization ☆189 · Updated 7 months ago
- Griffin MQA + Hawk Linear RNN Hybrid ☆87 · Updated last year
- supporting pytorch FSDP for optimizers ☆82 · Updated 7 months ago
- A repo based on XiLin Li's PSGD repo that extends some of the experiments. ☆14 · Updated 9 months ago
- ☆29 · Updated 3 months ago
- ☆32 · Updated 9 months ago
- Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster ☆70 · Updated last month
- ☆53 · Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆123 · Updated 7 months ago
- Experiment of using Tangent to autodiff triton ☆79 · Updated last year
- ☆53 · Updated last year
- A system for automating selection and optimization of pre-trained models from the TAO Model Zoo ☆25 · Updated last year
- Parallelizing non-linear sequential models over the sequence length ☆52 · Updated 3 weeks ago
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆94 · Updated 7 months ago
- Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.… ☆97 · Updated 6 months ago
- Understand and test language model architectures on synthetic tasks. ☆219 · Updated last month
- Normalized Transformer (nGPT) ☆184 · Updated 7 months ago