evintunador / templateGPT
customizable template GPT code designed for easy experimentation with novel architectures
☆23 · Updated this week
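As a rough illustration of the template idea, here is a minimal sketch of a GPT-style block whose sub-modules are selected through a single config object, so a novel architecture variant can be tried by changing one field. All names here (`ModelConfig`, `make_norm`, `Block`) are hypothetical, not templateGPT's actual API, and the code assumes PyTorch ≥ 2.4 for `nn.RMSNorm`:

```python
# Hypothetical sketch only: ModelConfig / make_norm / Block are illustrative
# names, not templateGPT's actual API. Assumes PyTorch >= 2.4 (nn.RMSNorm).
from dataclasses import dataclass

import torch
import torch.nn as nn


@dataclass
class ModelConfig:
    dim: int = 256
    n_heads: int = 4
    mlp_mult: int = 4
    norm: str = "rmsnorm"  # swap to "layernorm" to compare variants


def make_norm(cfg: ModelConfig) -> nn.Module:
    # One place to register alternative components for experiments.
    return nn.RMSNorm(cfg.dim) if cfg.norm == "rmsnorm" else nn.LayerNorm(cfg.dim)


class Block(nn.Module):
    """Pre-norm transformer block with config-selected sub-modules."""

    def __init__(self, cfg: ModelConfig):
        super().__init__()
        self.norm1 = make_norm(cfg)
        self.attn = nn.MultiheadAttention(cfg.dim, cfg.n_heads, batch_first=True)
        self.norm2 = make_norm(cfg)
        self.mlp = nn.Sequential(
            nn.Linear(cfg.dim, cfg.mlp_mult * cfg.dim),
            nn.GELU(),
            nn.Linear(cfg.mlp_mult * cfg.dim, cfg.dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Boolean causal mask: True marks positions a token may not attend to.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, device=x.device, dtype=torch.bool), 1)
        h = self.norm1(x)
        x = x + self.attn(h, h, h, attn_mask=mask, need_weights=False)[0]
        return x + self.mlp(self.norm2(x))
```

Routing every swappable component through one config keeps an experiment to a one-line diff, which is the kind of workflow the tagline describes.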
Related projects:
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B for free ☆217 · Updated 6 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆158 · Updated 2 months ago
- A set of scripts to fine-tune LLMs ☆36 · Updated 5 months ago
- An open-source toolkit for LLM distillation ☆284 · Updated last month
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆260 · Updated last month
- Toolkit for attaching, training, saving, and loading new heads for transformer models ☆237 · Updated last week
- Just a bunch of benchmark logs for different LLMs ☆112 · Updated last month
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" (see the absmean quantization sketch after this list) ☆155 · Updated 2 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆217 · Updated 2 months ago
- 1.58-bit LLaMA model ☆77 · Updated 5 months ago
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes. ☆81 · Updated last year
- Low-rank adapter (LoRA) extraction from fine-tuned transformer models (see the SVD sketch after this list) ☆154 · Updated 4 months ago
- Generate synthetic data using OpenAI, MistralAI, or AnthropicAI ☆223 · Updated 4 months ago
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆105 · Updated 3 months ago
- GPT-2 (124M) quality in 5B tokens ☆227 · Updated last week
- Modeling code for a BitNet b1.58 Llama-style model. ☆22 · Updated 4 months ago
- A compact LLM pretrained in 9 days on high-quality data ☆225 · Updated 3 weeks ago
- Embed arbitrary modalities (images, audio, documents, etc.) into large language models. ☆170 · Updated 5 months ago
- Video+code lecture on building nanoGPT from scratch ☆64 · Updated 3 months ago
- This is our own implementation of "Layer Selective Rank Reduction" ☆229 · Updated 3 months ago
- RAFT, or Retrieval-Augmented Fine-Tuning, is a method comprising a fine-tuning phase and a RAG-based retrieval phase. It is particularly sui… ☆60 · Updated 3 weeks ago
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget ☆115 · Updated 5 months ago
- One-click templates for running inference on language models ☆97 · Updated last week
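For the "Era of 1-bit LLMs" entry above, here is a minimal sketch of the paper's absmean weight quantization, the core of the BitNet b1.58 method: weights are scaled by their mean absolute value, rounded, and clipped to {-1, 0, +1}. This illustrates the math from the paper, not the linked repo's code:

```python
import torch


def absmean_ternary(w: torch.Tensor, eps: float = 1e-5):
    """Quantize a weight tensor to {-1, 0, +1} via BitNet b1.58's absmean rule."""
    scale = w.abs().mean().clamp(min=eps)    # gamma = mean(|W|)
    w_q = (w / scale).round().clamp_(-1, 1)  # RoundClip(W / gamma, -1, 1)
    return w_q, scale                        # approximate W as w_q * scale
```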
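And for the low-rank adapter extraction entry, the usual approach is a truncated SVD of the weight delta between a fine-tuned model and its base. A sketch under that assumption (the linked repo may differ in details such as rank selection); `extract_lora` is an illustrative name, not its API:

```python
import torch


def extract_lora(w_base: torch.Tensor, w_ft: torch.Tensor, rank: int = 16):
    """Factor the fine-tuning delta into LoRA matrices B @ A of the given rank."""
    delta = (w_ft - w_base).float()               # what fine-tuning changed
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    b = u[:, :rank] * s[:rank].sqrt()             # (out_features, rank)
    a = s[:rank].sqrt().unsqueeze(1) * vh[:rank]  # (rank, in_features)
    return a, b                                   # delta ~= b @ a
```

Splitting the singular values' square roots between the two factors keeps `A` and `B` on a similar scale, which tends to behave better if the extracted adapter is trained further.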