Bensmail-anis / developing-gpt2-124M-from-scratch

A custom implementation of OpenAI's GPT-2 (124M) from scratch, following the paper "Language Models are Unsupervised Multitask Learners". The model is trained on the FineWeb dataset (10 billion tokens).
11 · Apr 10, 2025 · Updated 10 months ago
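As a point of reference for what a "GPT-2 (124M) from scratch" build involves, below is a minimal sketch of the standard GPT-2 small hyperparameters and a rough parameter count. The `GPT2Config` dataclass and `count_params` helper are illustrative assumptions, not code from this repository; the numeric values are the published GPT-2 small settings.

```python
from dataclasses import dataclass

# Sketch of the GPT-2 small (124M) configuration; values are the published
# hyperparameters, not taken from this repository's source.
@dataclass
class GPT2Config:
    block_size: int = 1024   # maximum context length
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    n_layer: int = 12        # number of transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding / hidden dimension


def count_params(cfg: GPT2Config) -> int:
    """Approximate parameter count: embeddings + transformer blocks + final layer norm."""
    # Token and position embeddings.
    emb = cfg.vocab_size * cfg.n_embd + cfg.block_size * cfg.n_embd
    # Per block: attention + MLP weights (12 * d^2) plus biases and layer norms (13 * d).
    per_block = 12 * cfg.n_embd ** 2 + 13 * cfg.n_embd
    # Final layer norm (weight + bias).
    return emb + cfg.n_layer * per_block + 2 * cfg.n_embd


if __name__ == "__main__":
    print(f"~{count_params(GPT2Config()) / 1e6:.0f}M parameters")  # ~124M
```

Running the script prints roughly 124M parameters, which is where the "124M" in the repository name comes from.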

Alternatives and similar repositories for developing-gpt2-124M-from-scratch

Users who are interested in developing-gpt2-124M-from-scratch are comparing it to the libraries listed below.

