forbes110 / PLEDGE--Paragraph-LEvel-image-Description-GEneration

Applies an end-to-end model (ViT + GPT) to generate detailed, paragraph-level image descriptions, in contrast to traditional image captioning, which produces only object detections or a few simple sentences.
11 | Jan 15, 2025 | Updated last year

Alternatives and similar repositories for PLEDGE--Paragraph-LEvel-image-Description-GEneration

Users who are interested in PLEDGE--Paragraph-LEvel-image-Description-GEneration are comparing it to the libraries listed below.
