This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl.acm.org/doi/abs/10.1145/3617833 .
☆84Jun 16, 2025Updated last year
Alternatives and similar repositories for Multimodality-Representation-Learning
Users that are interested in Multimodality-Representation-Learning are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆27Mar 13, 2026Updated 3 months ago
- Large Visual Language Model(LVLM), Large Language Model(LLM), Multimodal Large Language Model(MLLM), Alignment, Agent, AI System, Survey☆21Jul 27, 2025Updated 11 months ago
- [⭐ CVPR 2025 Highlight ⭐] Official Implementation of the paper STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing fro…☆31Apr 22, 2025Updated last year
- Official code repository of paper titled "Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Visio…☆34May 11, 2025Updated last year
- Code for Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking☆33Mar 14, 2025Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- Code for paper "Cross-Domain Slot Filling as Machine Reading Comprehension" in IJCAI 2021☆11Aug 24, 2021Updated 4 years ago
- ☆10Oct 28, 2019Updated 6 years ago
- Source code of our EMNLP 2022 paper: Co-guiding Net: Achieving Mutual Guidances between Multiple Intent Detection and Slot Filling via He…☆12Nov 14, 2022Updated 3 years ago
- [ECCV'22] Official repository of paper titled "Class-agnostic Object Detection with Multi-modal Transformer".☆315May 9, 2023Updated 3 years ago
- DSTC9 Submission☆16Apr 12, 2021Updated 5 years ago
- Apps built using Inspired Cognition's Critique.☆56Mar 6, 2023Updated 3 years ago
- A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025)☆12Aug 11, 2025Updated 10 months ago
- [AAAI 2024] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning☆15Apr 29, 2024Updated 2 years ago
- Official repository of paper titled "UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalitie…☆170Jan 19, 2026Updated 5 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- A curated list of vision-and-language pre-training (VLP). :-)☆62Jul 6, 2022Updated 3 years ago
- Code for "SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling"☆19Nov 22, 2022Updated 3 years ago
- A Few-Shot Learning based Approach to Multimodal Social Relation Extraction☆14Jan 17, 2023Updated 3 years ago
- A Survey on Benchmarks of Multimodal Large Language Models☆155Jun 12, 2026Updated 2 weeks ago
- Code for "Out-of-Distribution Detection using Synthetic Data Generation"☆20Feb 6, 2025Updated last year
- ☆39Jan 9, 2025Updated last year
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.☆87Jan 19, 2026Updated 5 months ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…☆20Jun 3, 2024Updated 2 years ago
- The dataset and code for PeerSum at EMNLP'23.☆16Oct 20, 2025Updated 8 months ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Uncertainty-Guided Pseudo-Labelling with Model Averaging☆11Mar 17, 2026Updated 3 months ago
- ☆11Oct 29, 2024Updated last year
- [NeurIPS 2022] Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings☆22Jan 30, 2023Updated 3 years ago
- FedCMR: Federated Cross-Modal Retrieval 的代码(the official implementation of FedCMR: Federated Cross-Modal Retrieval)☆17Oct 17, 2025Updated 8 months ago
- [EMNLP 2024] Official repository for paper "From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis"☆22Oct 15, 2024Updated last year
- Code and dataset release for the paper "Unstructured Evidence Attribution for Long Context Query Focused Summarization"☆12Nov 3, 2025Updated 7 months ago
- Code for Findings of ACL 2022 Paper "Sentiment Word Aware Multimodal Refinement for Multimodal Sentiment Analysis with ASR Errors"☆26Jun 15, 2022Updated 4 years ago
- Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training☆30Jun 20, 2023Updated 3 years ago
- ☆17Nov 18, 2024Updated last year
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- ☆53Sep 13, 2023Updated 2 years ago
- ☆48Jun 25, 2025Updated last year
- The official code and model for ACL 2023 paper 'mCLIP: Multilingual CLIP via Cross-lingual Transfer'☆10Jan 23, 2024Updated 2 years ago
- ☆43Sep 3, 2024Updated last year
- ☆10Apr 7, 2024Updated 2 years ago
- [ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors".☆12Oct 11, 2024Updated last year
- ☆12Jan 10, 2025Updated last year