TheoCoombes / ClipCap

Using pretrained encoder and language models to generate captions from multimedia inputs.
95 · Updated last year

Related projects

Alternatives and complementary repositories for ClipCap