kabachuha / nanoGPKANT
Testing KAN-based text generation GPT models
☆18 Updated last year
Alternatives and similar repositories for nanoGPKANT
Users who are interested in nanoGPKANT are comparing it to the repositories listed below
- Collection of autoregressive model implementation ☆85 Updated 8 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆41 Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆109 Updated 9 months ago
- Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models" ☆166 Updated 11 months ago
- ☆62 Updated 2 years ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated last year
- Training Models Daily ☆17 Updated 2 years ago
- Curriculum training of instruction-following LLMs with Unsloth ☆14 Updated 2 weeks ago
- Set of scripts to finetune LLMs ☆38 Updated last year
- ☆12 Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆103 Updated last year
- Tokun to can tokens ☆18 Updated 6 months ago
- ☆105 Updated 11 months ago
- ☆137 Updated last year
- KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci… ☆24 Updated 2 years ago
- QLoRA with Enhanced Multi GPU Support ☆37 Updated 2 years ago
- new optimizer ☆20 Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- Simple GRPO scripts and configurations. ☆59 Updated 10 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- Implementation of the Mamba SSM with hf_integration. ☆56 Updated last year
- Full finetuning of large language models without large memory requirements ☆94 Updated 3 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 Updated 8 months ago
- ☆39 Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆66 Updated last month
- Code base for internal reward models and PPO training ☆24 Updated 2 years ago
- gzip Predicts Data-dependent Scaling Laws ☆34 Updated last year
- ☆45 Updated 2 years ago
- An introduction to LLM Sampling ☆79 Updated last year
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆61 Updated last year