junchen14 / Multi-Modal-Transformer

This repository collects multi-modal transformer architectures, including image transformers, video transformers, image-language transformers, video-language transformers, and self-supervised learning models. It also collects useful tutorials and tools in these related domains.
226 · Updated 2 years ago

Alternatives and similar repositories for Multi-Modal-Transformer:

Users who are interested in Multi-Modal-Transformer are comparing it to the libraries listed below.