Bensmail-anis / developing-gpt2-124M-from-scratch

A custom implementation of OpenAI's GPT-2 (124M) from scratch, following the paper "Language Models are Unsupervised Multitask Learners". The model is trained on the FineWeb dataset (10 billion tokens).
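For reference, "124M" refers to the smallest GPT-2 configuration. Below is a minimal sketch, assuming the standard GPT-2 "small" hyperparameters (12 layers, 12 heads, 768-dim embeddings, 50,257-token vocabulary); the names and structure are illustrative and not taken from this repository's code.

```python
from dataclasses import dataclass

# Hypothetical config sketch: standard GPT-2 "small" hyperparameters
# (assumed for illustration; the repository may organize this differently).
@dataclass
class GPT2Config:
    block_size: int = 1024   # maximum sequence length
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    n_layer: int = 12        # number of transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # embedding / hidden dimension

def approx_param_count(cfg: GPT2Config) -> int:
    """Rough weight count for a GPT-2-style decoder with tied input/output embeddings."""
    embeddings = cfg.vocab_size * cfg.n_embd + cfg.block_size * cfg.n_embd
    # per block: attention projections (4*d^2) + MLP (8*d^2), ignoring biases and LayerNorms
    per_block = 12 * cfg.n_embd ** 2
    return embeddings + cfg.n_layer * per_block

print(approx_param_count(GPT2Config()))  # ~124 million parameters
```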

Alternatives and similar repositories for developing-gpt2-124M-from-scratch

Users interested in developing-gpt2-124M-from-scratch are comparing it to the libraries listed below.
