cene555 / ruCLIP-SB

RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for computing similarity between images and texts and for ranking captions and pictures. Unlike other versions of the model, this version uses BERT as the text encoder and a Swin Transformer as the image encoder.
11 · Updated 2 years ago
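
As an illustration of the image-text similarity and caption ranking described above, here is a minimal sketch in plain Python. It does not use the ruCLIP-SB API: the embeddings are hand-written toy vectors standing in for encoder outputs, the function names (`l2_normalize`, `cosine_similarity`, `rank_captions`) are hypothetical, and the temperature of 0.07 is an assumed value, not one taken from this repository.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_captions(image_emb, caption_embs, temperature=0.07):
    """Rank captions for one image, CLIP-style.

    Returns (order, probs): caption indices sorted from best to worst
    match, and a softmax distribution over the captions.
    The temperature is an assumption for this sketch.
    """
    sims = [cosine_similarity(image_emb, c) for c in caption_embs]
    logits = [s / temperature for s in sims]
    # Subtract the maximum logit for a numerically stable softmax.
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order, probs


# Toy embeddings standing in for the image and text encoder outputs.
image_emb = [0.9, 0.1, 0.0]
caption_embs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

order, probs = rank_captions(image_emb, caption_embs)
print("ranking:", order)
print("probabilities:", [round(p, 4) for p in probs])
```

In a real setup the two embeddings would come from the image encoder and the text encoder respectively; the ranking step itself only needs the vectors to live in the same embedding space.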

Alternatives and similar repositories for ruCLIP-SB:

Users who are interested in ruCLIP-SB are comparing it to the libraries listed below.