This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl.acm.org/doi/abs/10.1145/3617833 .
☆84Jun 16, 2025Updated 11 months ago
Alternatives and similar repositories for Multimodality-Representation-Learning
Users that are interested in Multimodality-Representation-Learning are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆25Mar 13, 2026Updated 2 months ago
- Large Visual Language Model(LVLM), Large Language Model(LLM), Multimodal Large Language Model(MLLM), Alignment, Agent, AI System, Survey☆21Jul 27, 2025Updated 9 months ago
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023]☆101Apr 30, 2024Updated 2 years ago
- [ACCV 2024] ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes 🚀🚀🚀☆37Jan 21, 2025Updated last year
- [⭐ CVPR 2025 Highlight ⭐] Official Implementation of the paper STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing fro…☆31Apr 22, 2025Updated last year
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Code for Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking☆33Mar 14, 2025Updated last year
- Code for paper "Cross-Domain Slot Filling as Machine Reading Comprehension" in IJCAI 2021☆11Aug 24, 2021Updated 4 years ago
- ☆10Oct 28, 2019Updated 6 years ago
- Source code of our EMNLP 2022 paper: Co-guiding Net: Achieving Mutual Guidances between Multiple Intent Detection and Slot Filling via He…☆12Nov 14, 2022Updated 3 years ago
- Joint learning of images and text via maximization of mutual information☆19Dec 14, 2021Updated 4 years ago
- DSTC9 Submission☆16Apr 12, 2021Updated 5 years ago
- Apps built using Inspired Cognition's Critique.☆57Mar 6, 2023Updated 3 years ago
- A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025)☆12Aug 11, 2025Updated 9 months ago
- [AAAI 2024] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning☆15Apr 29, 2024Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- ☆70Jul 2, 2025Updated 10 months ago
- Official repository of paper titled "UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalitie…☆167Jan 19, 2026Updated 4 months ago
- Code for "SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling"☆18Nov 22, 2022Updated 3 years ago
- [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners"☆109Jul 15, 2023Updated 2 years ago
- A Survey on Benchmarks of Multimodal Large Language Models☆149Jul 1, 2025Updated 10 months ago
- ☆36Jan 9, 2025Updated last year
- Code for "Out-of-Distribution Detection using Synthetic Data Generation"☆20Feb 6, 2025Updated last year
- [CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners".☆307Apr 3, 2024Updated 2 years ago
- This is the code for the Submission 3358 at NeurIPS 2022.☆22Dec 21, 2022Updated 3 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.☆83Jan 19, 2026Updated 4 months ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…☆20Jun 3, 2024Updated last year
- The dataset and code for PeerSum at EMNLP'23.☆16Oct 20, 2025Updated 7 months ago
- Validating image classification benchmark results on ViTs and ResNets (v2)☆13Nov 3, 2022Updated 3 years ago
- Uncertainty-Guided Pseudo-Labelling with Model Averaging☆11Mar 17, 2026Updated 2 months ago
- [InterSpeech 2024] Official code repository of paper titled "Bird Whisperer: Leveraging Large Pre-trained Acoustic Model for Bird Call Cl…☆39Dec 11, 2024Updated last year
- About Data and Codes for EMNLP 2023 System Demo Paper "QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking"☆19Dec 19, 2023Updated 2 years ago
- [EMNLP 2024] Official repository for paper "From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis"☆21Oct 15, 2024Updated last year
- Code and dataset release for the paper "Unstructured Evidence Attribution for Long Context Query Focused Summarization"☆11Nov 3, 2025Updated 6 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training☆30Jun 20, 2023Updated 2 years ago
- Code for Findings of ACL 2022 Paper "Sentiment Word Aware Multimodal Refinement for Multimodal Sentiment Analysis with ASR Errors"☆26Jun 15, 2022Updated 3 years ago
- Implementation for NeurIPS 2024 paper "SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models" (ht…☆14Dec 23, 2024Updated last year
- ☆17Nov 18, 2024Updated last year
- ☆53Sep 13, 2023Updated 2 years ago
- ☆19Sep 18, 2025Updated 8 months ago
- The official code and model for ACL 2023 paper 'mCLIP: Multilingual CLIP via Cross-lingual Transfer'☆10Jan 23, 2024Updated 2 years ago