MILVLG / rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
☆56 · Updated last year
Alternatives and similar repositories for rosita:
Users who are interested in rosita are comparing it to the repositories listed below:
- Human-like Controllable Image Captioning with Verb-specific Semantic Roles.