Bensmail-anis / developing-gpt2-124M-from-scratch
A custom implementation of OpenAI's GPT-2 (124M) from scratch, following the paper "Language Models are Unsupervised Multitask Learners". The model is trained on the FineWeb dataset (10 billion tokens).
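For context, the standard GPT-2 (124M) configuration is 12 transformer layers, 12 attention heads, a 768-dimensional embedding, a 1024-token context window, and a 50,257-token vocabulary. A minimal sketch of how those hyperparameters add up to roughly 124M parameters (assuming tied input/output embeddings; the repo's own class and field names may differ):

```python
from dataclasses import dataclass

# Hypothetical config class for illustration; names are not taken from the repo.
@dataclass
class GPT2Config:
    vocab_size: int = 50257
    block_size: int = 1024   # context length
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768

def param_count(cfg: GPT2Config) -> int:
    """Approximate parameter count, assuming the LM head is tied to the token embedding."""
    wte = cfg.vocab_size * cfg.n_embd              # token embedding (shared with lm_head)
    wpe = cfg.block_size * cfg.n_embd              # learned positional embedding
    per_block = (
        2 * 2 * cfg.n_embd                               # two LayerNorms (weight + bias)
        + cfg.n_embd * 3 * cfg.n_embd + 3 * cfg.n_embd   # fused QKV projection
        + cfg.n_embd * cfg.n_embd + cfg.n_embd           # attention output projection
        + cfg.n_embd * 4 * cfg.n_embd + 4 * cfg.n_embd   # MLP up-projection (4x width)
        + 4 * cfg.n_embd * cfg.n_embd + cfg.n_embd       # MLP down-projection
    )
    final_ln = 2 * cfg.n_embd                      # final LayerNorm
    return wte + wpe + cfg.n_layer * per_block + final_ln

print(param_count(GPT2Config()))  # → 124439808, i.e. ~124M
```

The "124M" in the repo name refers to this count; most of it sits in the token embedding (~38.6M) and the 12 transformer blocks (~85M).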
★ 11 · Apr 10, 2025 · Updated 11 months ago

Alternatives and similar repositories for developing-gpt2-124M-from-scratch

Users interested in developing-gpt2-124M-from-scratch are comparing it to the libraries listed below.

