cene555 / ruCLIP-SB

RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and rearranging captions and pictures. Unlike other versions of the model we use BERT for text encoder and SWIN transformer for image encoder.
11Updated 2 years ago

Related projects

Alternatives and complementary repositories for ruCLIP-SB