gchochla / VAuLT

This repo contains the original implementation of VAuLT, the Vision-and-Augmented-Language Transformer. We provide instructions for downloading several multimodal social-media datasets, along with scripts for running experiments. VAuLT is a stack of Transformers: an LM such as BERT preprocesses the text input of ViLT.
18 · Sep 23, 2025 · Updated 4 months ago

