yaohungt / Multimodal-Transformer
[ACL'19] [PyTorch] Multimodal Transformer
☆847Updated 2 years ago
Alternatives and similar repositories for Multimodal-Transformer:
Users who are interested in Multimodal-Transformer are comparing it to the libraries listed below.
- This repository contains various models targeting multimodal representation learning, multimodal fusion for downstream tasks such as mul…☆788Updated last year
- [NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning☆509Updated last year
- MMSA is a unified framework for Multimodal Sentiment Analysis.☆726Updated 2 weeks ago
- ☆199Updated 3 years ago
- This is the repository for "Efficient Low-rank Multimodal Fusion with Modality-Specific Factors", Liu and Shen, et al., ACL 2018☆258Updated 4 years ago
- MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis☆213Updated last year
- PyTorch implementation of Tensor Fusion Networks for multimodal sentiment analysis.☆180Updated 4 years ago
- Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"☆1,428Updated 9 months ago
- Attention-based multimodal fusion for sentiment analysis☆333Updated 9 months ago
- PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".☆939Updated 2 years ago
- Code for the paper "Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis"☆199Updated 2 years ago
- This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information M…☆171Updated last year
- Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".☆741Updated last year
- A curated list of Multimodal Related Research.☆1,331Updated last year
- Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"☆788Updated 3 years ago
- Multi Task Vision and Language☆803Updated 2 years ago
- Recent Advances in Vision and Language PreTrained Models (VL-PTMs)☆1,148Updated 2 years ago
- [AAAI 2018] Memory Fusion Network for Multi-view Sequential Learning☆112Updated 4 years ago
- Deep Modular Co-Attention Networks for Visual Question Answering☆447Updated 4 years ago
- Code for ALBEF: a new vision-language pre-training method☆1,599Updated 2 years ago
- ☆178Updated last year
- Paper List for Multimodal Sentiment Analysis☆98Updated 4 years ago
- A Tool for extracting multimodal features from videos.☆152Updated last year
- BLOCK (AAAI 2019), with a multimodal fusion library for deep learning models☆348Updated 5 years ago
- Supervised Multimodal Bitransformers for Classifying Images and Text☆248Updated 3 years ago
- ☆473Updated 2 years ago
- METER: A Multimodal End-to-end TransformER Framework☆365Updated 2 years ago
- ☆162Updated 4 years ago
- [CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…☆713Updated last year
- Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"☆530Updated last year