karpathy / nano-llama31

nanoGPT style version of Llama 3.1

☆1,432

Alternatives and similar repositories for nano-llama31

Users who are interested in nano-llama31 are comparing it to the repositories listed below.

policy-gradient / GRPO-Zero
Implementing DeepSeek R1's GRPO algorithm from scratch
☆1,609 Updated 5 months ago
KellerJordan / modded-nanogpt
NanoGPT (124M) in 3 minutes
☆3,176 Updated 2 months ago
huggingface / picotron
Minimalistic 4D-parallelism distributed training framework for education purpose
☆1,846 Updated last month
EurekaLabsAI / micrograd
The Autograd Engine
☆636 Updated last year
EurekaLabsAI / ngram
The n-gram Language Model
☆1,446 Updated last year
EurekaLabsAI / mlp
The Multilayer Perceptron Language Model
☆568 Updated last year
facebookresearch / MobileLLM
MobileLLM Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
☆1,374 Updated 5 months ago
huggingface / nanotron
Minimalistic large language model 3D-parallelism training
☆2,252 Updated last month
microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆916 Updated 5 months ago
mlfoundations / dclm
DataComp for Language Models
☆1,371 Updated last month
EurekaLabsAI / tensor
The Tensor (or Array)
☆449 Updated last year
facebookresearch / blt
Code for BLT research paper
☆1,989 Updated 4 months ago
clu0 / unet.cu
UNet diffusion model in pure CUDA
☆649 Updated last year
pytorch / torchtitan
A PyTorch native platform for training generative AI models
☆4,525 Updated this week
bkitano / llama-from-scratch
Llama from scratch, or How to implement a paper without crying
☆580 Updated last year
karpathy / build-nanogpt
Video+code lecture on building nanoGPT from scratch
☆4,423 Updated last year
huggingface / search-and-learn
Recipes to scale inference-time compute of open models
☆1,109 Updated 4 months ago
AviSoori1x / makeMoE
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
☆751 Updated 11 months ago
pytorch / torchchat
Run PyTorch LLMs locally on servers, desktop and mobile
☆3,611 Updated last month
EleutherAI / cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
☆818 Updated 2 months ago
SonyResearch / micro_diffusion
Official repository for our work on micro-budget training of large-scale diffusion models.
☆1,514 Updated 9 months ago
XueFuzhao / OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
☆1,612 Updated last year
McGill-NLP / nano-aha-moment
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
☆536 Updated last week
allenai / OLMoE
OLMoE: Open Mixture-of-Experts Language Models
☆878 Updated 3 weeks ago
MoonshotAI / Moonlight
Muon is Scalable for LLM Training
☆1,325 Updated 2 months ago
natolambert / rlhf-book
Textbook on reinforcement learning from human feedback
☆1,259 Updated 2 weeks ago
KellerJordan / Muon
Muon is an optimizer for hidden layers in neural networks
☆1,827 Updated 3 months ago
facebookresearch / chameleon
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
☆2,056 Updated last year
open-thought / system-2-research
System 2 Reasoning Link Collection
☆856 Updated 7 months ago
huggingface / lighteval
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
☆2,009 Updated this week