gchochla / VAuLT

This repo contains the original implementation of VAuLT, the Vision-and-Augmented-Language Transformer. We provide instructions for downloading several multimodal social-media datasets, along with scripts for running experiments. VAuLT is a stack of Transformers in which a language model (LM) such as BERT preprocesses the text input of ViLT.
18 · Updated last month

Alternatives and similar repositories for VAuLT

Users who are interested in VAuLT are comparing it to the libraries listed below.
