rpeloff / multimodal_one_shot_learning
Code recipe for "Multimodal One-Shot Learning of Speech and Images"
☆11 Updated 6 years ago
Alternatives and similar repositories for multimodal_one_shot_learning
Users who are interested in multimodal_one_shot_learning are comparing it to the repositories listed below.
- ☆48 Updated 6 years ago
- Multimodal classification solution for the SIGIR eCOM using Co-attention and transformer language models ☆19 Updated 4 years ago
- [ICLR 2019] Learning Factorized Multimodal Representations ☆67 Updated 4 years ago
- PyTorch implementation of 'See, Hear, and Read: Deep Aligned Representations' ☆33 Updated 6 years ago
- ☆17 Updated 2 years ago
- Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition ☆14 Updated 3 years ago
- Generalized cross-modal NNs; new audiovisual benchmark (IEEE TNNLS 2019) ☆27 Updated 5 years ago
- DeepCU: Integrating Both Common and Unique Latent Information for Multimodal Sentiment Analysis, IJCAI-19 ☆19 Updated 5 years ago
- Python code for the cross-modal retrieval system proposed at ACM MM '10 in "A New Approach to Cross-Modal Multimedia Retrieval" ☆20 Updated 10 years ago
- Multimodal Adversarial Network for Cross-modal Retrieval (PyTorch Code) ☆30 Updated 5 years ago
- Code for the paper: Audio-Visual Model Distillation Using Acoustic Images ☆21 Updated 2 years ago
- ☆64 Updated 5 years ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers. ☆52 Updated 3 years ago
- Reproduces the results of Adversarial Cross-Modal Retrieval (ACMR) ☆23 Updated 5 years ago
- Source code for training Gated Multimodal Units on MM-IMDb dataset ☆95 Updated 2 years ago
- Deep Multimodal Multilinear Fusion with High-order Polynomial Pooling ☆26 Updated 5 years ago
- Re-implementation of the work Livebot ☆16 Updated 5 years ago
- PyTorch implementation of the paper [CVPR 2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning ☆88 Updated 3 years ago
- Poet: Product-oriented Video Captioner for E-commerce ☆12 Updated 4 years ago
- Code and dataset of "MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos" in MM'20. ☆53 Updated 2 years ago
- A GCN based visual question generation model ☆13 Updated 5 years ago
- Code for NAACL 2021 paper: MTAG: Modal-Temporal Attention Graph for Unaligned Human Multimodal Language Sequences ☆42 Updated 2 years ago
- MATLAB Implementation of Scatter Component Analysis (SCA) for Domain Generalization ☆21 Updated 6 years ago
- [AAAI 2018] Memory Fusion Network for Multi-view Sequential Learning ☆114 Updated 4 years ago
- Learning to Separate Object Sounds by Watching Unlabeled Video (ECCV 2018) ☆51 Updated 5 years ago
- ☆28 Updated 3 years ago
- Dual cross modality attention audio-visual speech recognition model based on VGG transformer with hybrid CTC/attention architecture using… ☆12 Updated 4 years ago
- Implementation of "Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning" (https://arxiv.… ☆26 Updated 6 years ago
- Multi-modal Multi-label Emotion Recognition with Heterogeneous Hierarchical Message Passing ☆17 Updated 2 years ago
- ☆12 Updated 8 years ago