This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey: https://dl.acm.org/doi/abs/10.1145/3617833
☆84 · Jun 16, 2025 · Updated 11 months ago
Alternatives and similar repositories for Multimodality-Representation-Learning
Users interested in Multimodality-Representation-Learning are comparing it to the repositories listed below.
- ☆26 · Mar 13, 2026 · Updated 2 months ago
- [Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback ☆40 · Aug 14, 2023 · Updated 2 years ago
- [ACCV 2024] ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes 🚀🚀🚀 ☆37 · Jan 21, 2025 · Updated last year
- Code for Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking ☆33 · Mar 14, 2025 · Updated last year
- [ECCV'22] Official repository of paper titled "Class-agnostic Object Detection with Multi-modal Transformer". ☆315 · May 9, 2023 · Updated 3 years ago
- Joint learning of images and text via maximization of mutual information ☆19 · Dec 14, 2021 · Updated 4 years ago
- A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025) ☆12 · Aug 11, 2025 · Updated 10 months ago
- [ACL 2025 🔥] Time Travel is a Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts ☆19 · May 22, 2025 · Updated last year
- DDAM-PS: Diligent Domain Adaptive Mixer for Person Search -- WACV2024 ☆13 · Feb 28, 2024 · Updated 2 years ago
- ☆70 · Jul 2, 2025 · Updated 11 months ago
- Official repository of paper titled "UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalitie… ☆170 · Jan 19, 2026 · Updated 4 months ago
- ☆10 · May 16, 2025 · Updated last year
- This is a repo listing some must-read papers on *AI-driven MOOCs* or *Intelligent Education* published in recent years, mainly contribute… ☆17 · Jun 8, 2022 · Updated 4 years ago
- A curated list of vision-and-language pre-training (VLP). :-) ☆62 · Jul 6, 2022 · Updated 3 years ago
- Code for "SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling" ☆18 · Nov 22, 2022 · Updated 3 years ago
- A Few-Shot Learning based Approach to Multimodal Social Relation Extraction ☆14 · Jan 17, 2023 · Updated 3 years ago
- [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners" ☆109 · Jul 15, 2023 · Updated 2 years ago
- ☆38 · Jan 9, 2025 · Updated last year
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge. ☆85 · Jan 19, 2026 · Updated 4 months ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish… ☆20 · Jun 3, 2024 · Updated 2 years ago
- Validating image classification benchmark results on ViTs and ResNets (v2) ☆13 · Nov 3, 2022 · Updated 3 years ago
- [ICCV'23 Main Track, WECIA'23 Oral] Official repository of paper titled "Self-regulating Prompts: Foundational Model Adaptation without F… ☆286 · Sep 28, 2023 · Updated 2 years ago
- A video captioning tool using S2VT method and attention mechanism (TensorFlow) ☆15 · Oct 14, 2018 · Updated 7 years ago
- [InterSpeech 2024] Official code repository of paper titled "Bird Whisperer: Leveraging Large Pre-trained Acoustic Model for Bird Call Cl… ☆39 · Dec 11, 2024 · Updated last year
- ☆17 · Nov 3, 2024 · Updated last year
- ☆11 · Oct 29, 2024 · Updated last year
- [NeurIPS 2022] Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings ☆22 · Jan 30, 2023 · Updated 3 years ago
- ☆28 · Sep 3, 2024 · Updated last year
- [EMNLP 2024] Official repository for paper "From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis" ☆22 · Oct 15, 2024 · Updated last year
- Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training ☆30 · Jun 20, 2023 · Updated 2 years ago
- Code for Findings of ACL 2022 Paper "Sentiment Word Aware Multimodal Refinement for Multimodal Sentiment Analysis with ASR Errors" ☆26 · Jun 15, 2022 · Updated 3 years ago
- Implementation for NeurIPS 2024 paper "SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models" (ht… ☆14 · Dec 23, 2024 · Updated last year
- ☆17 · Nov 18, 2024 · Updated last year
- ☆45 · Jun 25, 2025 · Updated 11 months ago
- The official code and model for ACL 2023 paper 'mCLIP: Multilingual CLIP via Cross-lingual Transfer' ☆10 · Jan 23, 2024 · Updated 2 years ago
- ☆43 · Sep 3, 2024 · Updated last year
- ☆10 · Apr 7, 2024 · Updated 2 years ago
- [ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors". ☆12 · Oct 11, 2024 · Updated last year
- ☆12 · Jan 10, 2025 · Updated last year