shreydan / VisionGPT2

Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch.
42 · Updated last year

Alternatives and similar repositories for VisionGPT2

Users who are interested in VisionGPT2 are comparing it to the repositories listed below.
