Laz4rz / GPT-2Links

Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish

☆172

Alternatives and similar repositories for GPT-2

Users that are interested in GPT-2 are comparing it to the libraries listed below

Sorting:

smolorg / smolgrad
small auto-grad engine inspired from Karpathy's micrograd and PyTorch
☆278Updated 9 months ago
xjdr-alt / simple_transformer
Simple Transformer in Jax
☆139Updated last year
arpitingle / gpu-alpha
High Quality Resources on GPU Programming/Architecture
☆588Updated last year
joey00072 / Tinytorch
A really tiny autograd engine
☆95Updated 2 months ago
smolorg / smolvecstore
a tiny vectorstore implementation built with numpy.
☆63Updated last year
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆64Updated 3 months ago
Maharshi-Pandya / cudacodes
Learnings and programs related to CUDA
☆415Updated last month
Quentin-Anthony / torch-profiling-tutorial
☆464Updated 2 weeks ago
0xD4rky / Vision-Transformers
This repo has all the basic things you'll need in-order to understand complete vision transformer architecture and its various implementa…
☆226Updated 7 months ago
EurekaLabsAI / tensor
The Tensor (or Array)
☆442Updated last year
spikedoanz / from-bits-to-intelligence
could we make an ml stack in 100,000 lines of code?
☆46Updated last year
MarioSieg / magnetron
(WIP) A small but powerful, homemade PyTorch from scratch.
☆558Updated last week
omkaark / simple-federated-learning
☆96Updated last year
seatedro / arxival
☆92Updated 7 months ago
ulrichstern / cuda-convnet
Alex Krizhevsky's original code from Google Code
☆196Updated 9 years ago
obadakhalili / tinygrad-tensor-puzzles
Solve puzzles to improve your tinygrad skills!
☆142Updated 5 months ago
haraschax / nograd
Gradient descent is cool and all, but what if we could delete it?
☆104Updated 2 weeks ago
nano-R1 / resources
Compiling useful links, papers, benchmarks, ideas, etc.
☆45Updated 5 months ago
tokenbender / avataRL
rl from zero pretrain, can it be done? yes.
☆250Updated this week
shivendrra / SmallLanguageModel
a LLM cookbook, for building your own from scratch, all the way from gathering data to training a model
☆151Updated last year
N8python / mlx-pretrain
A simple MLX implementation for pretraining LLMs on Apple Silicon.
☆84Updated this week
EleutherAI / cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
☆812Updated 3 weeks ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆105Updated 5 months ago
smolorg / smoltropix
MLX port for xjdr's entropix sampler (mimics jax implementation)
☆63Updated 9 months ago
gautierdag / bpeasy
Fast bare-bones BPE for modern tokenizer training
☆164Updated 2 months ago
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆190Updated 2 months ago
nuwandavek / karpathify
☆92Updated 10 months ago
kmohan321 / Research_Papers
☆46Updated 4 months ago
Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆79Updated 8 months ago
yacineMTB / just-large-models
Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip.
☆44Updated last year