kabachuha / nanoGPKANT

Testing KAN-based text generation GPT models

☆18

Alternatives and similar repositories for nanoGPKANT

Users who are interested in nanoGPKANT are comparing it to the repositories listed below.

joey00072 / ohara
Collection of autoregressive model implementations
☆86 Updated 6 months ago
okarthikb / state-space-models
☆28 Updated last year
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41 Updated last year
NousResearch / StripedHyenaTrainer
☆62 Updated last year
JoeLi12345 / nGPT
An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆107 Updated 8 months ago
lucidrains / grokfast-pytorch
Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"
☆103 Updated 11 months ago
oKatanaaa / kolibrify
Curriculum training of instruction-following LLMs with Unsloth
☆14 Updated 8 months ago
abetlen / program-constrained-language-model-sampling
☆35 Updated 2 years ago
Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆79 Updated 11 months ago
Narsil / hf-chat
☆25 Updated 11 months ago
hundredblocks / large-model-parallelism
Functional local implementations of main model parallelism approaches
☆96 Updated 2 years ago
SebastianBodza / EnsembleForecasting
Using multiple LLMs for ensemble Forecasting
☆16 Updated last year
kyegomez / swarms-pytorch
Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊
☆134 Updated last month
idiap / sigma-gpt
σ-GPT: A New Approach to Autoregressive Models
☆69 Updated last year
goncalorafaria / qalign
QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods.
☆24 Updated last week
ChrisHayduk / qlora-multi-gpu
QLoRA with Enhanced Multi GPU Support
☆37 Updated 2 years ago
QuixiAI / grokadamw
☆136 Updated last year
xjdr-alt / muzero_sketch
☆40 Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆58 Updated last month
knowrohit / know_medical_dialogues
KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci…
☆24 Updated 2 years ago
corl-team / rebased
Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"
☆166 Updated 10 months ago
tyler-romero / microR1
Simple repository for training small reasoning models
☆45 Updated 9 months ago
yacineMTB / just-large-models
Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip.
☆44 Updated 2 years ago
AlxSp / t-jepa
☆11 Updated last year
evanatyourservice / llm-jax
Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆18 Updated 3 months ago
Algomancer / The-Daily-Train
Training Models Daily
☆16 Updated last year
VatsaDev / NanoPhi-alpha
GPT-2 small trained on phi-like data
☆67 Updated last year
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆94 Updated 2 months ago
vikhyat / mixtral-inference
inference code for mixtral-8x7b-32kseqlen
☆102 Updated last year
leloykun / modded-nanogpt
NanoGPT (124M) quality in 2.67B tokens
☆28 Updated 2 months ago