This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey: https://dl.acm.org/doi/abs/10.1145/3617833
☆84Jun 16, 2025Updated 10 months ago
Alternatives and similar repositories for Multimodality-Representation-Learning
Users that are interested in Multimodality-Representation-Learning are comparing it to the libraries listed below.
- ☆25Mar 13, 2026Updated last month
- [Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback☆40Aug 14, 2023Updated 2 years ago
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023]☆101Apr 30, 2024Updated 2 years ago
- This is code for the EMNLP 2022 Paper "UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation".☆10Apr 30, 2023Updated 3 years ago
- [ACCV 2024] ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes 🚀🚀🚀☆37Jan 21, 2025Updated last year
- [⭐ CVPR 2025 Highlight ⭐] Official Implementation of the paper STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing fro…☆30Apr 22, 2025Updated last year
- Official code repository of paper titled "Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Visio…☆34May 11, 2025Updated 11 months ago
- Code for Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking☆33Mar 14, 2025Updated last year
- Code for paper "Cross-Domain Slot Filling as Machine Reading Comprehension" in IJCAI 2021☆11Aug 24, 2021Updated 4 years ago
- Source code of our EMNLP 2022 paper: Co-guiding Net: Achieving Mutual Guidances between Multiple Intent Detection and Slot Filling via He…☆12Nov 14, 2022Updated 3 years ago
- [ECCV'22] Official repository of paper titled "Class-agnostic Object Detection with Multi-modal Transformer".☆315May 9, 2023Updated 2 years ago
- Joint learning of images and text via maximization of mutual information☆19Dec 14, 2021Updated 4 years ago
- Apps built using Inspired Cognition's Critique.☆57Mar 6, 2023Updated 3 years ago
- A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes (WACV 2025)☆12Aug 11, 2025Updated 8 months ago
- [ACL 2025 🔥] Time Travel is a Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts☆19May 22, 2025Updated 11 months ago
- DDAM-PS: Diligent Domain Adaptive Mixer for Person Search -- WACV2024☆13Feb 28, 2024Updated 2 years ago
- ☆70Jul 2, 2025Updated 10 months ago
- Official repository of paper titled "UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalitie…☆165Jan 19, 2026Updated 3 months ago
- This is a repo listing some must-read papers on *AI-driven MOOCs* or *Intelligent Education* published in recent years, mainly contribute…☆17Jun 8, 2022Updated 3 years ago
- A curated list of vision-and-language pre-training (VLP). :-)☆62Jul 6, 2022Updated 3 years ago
- Code for "SLIM: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling"☆18Nov 22, 2022Updated 3 years ago
- A Few-Shot Learning based Approach to Multimodal Social Relation Extraction☆14Jan 17, 2023Updated 3 years ago
- A Hybrid Change Encoder for Remote Sensing Change Detection (IGARSS 2024)☆18Jun 26, 2024Updated last year
- ☆36Jan 9, 2025Updated last year
- Code for "Out-of-Distribution Detection using Synthetic Data Generation"☆21Feb 6, 2025Updated last year
- [CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners".☆307Apr 3, 2024Updated 2 years ago
- This is the code for the Submission 3358 at NeurIPS 2022.☆22Dec 21, 2022Updated 3 years ago
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.☆82Jan 19, 2026Updated 3 months ago
- Uncertainty-Guided Pseudo-Labelling with Model Averaging☆11Mar 17, 2026Updated last month
- [ICCV'23 Main Track, WECIA'23 Oral] Official repository of paper titled "Self-regulating Prompts: Foundational Model Adaptation without F…☆287Sep 28, 2023Updated 2 years ago
- [InterSpeech 2024] Official code repository of paper titled "Bird Whisperer: Leveraging Large Pre-trained Acoustic Model for Bird Call Cl…☆39Dec 11, 2024Updated last year
- Knowledge graph infrastructure☆11Jul 25, 2022Updated 3 years ago
- ☆17Nov 3, 2024Updated last year
- Data and code for the EMNLP 2023 System Demo paper "QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking"☆19Dec 19, 2023Updated 2 years ago
- ☆11Oct 29, 2024Updated last year
- [NeurIPS 2022] Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings☆22Jan 30, 2023Updated 3 years ago
- ☆27Sep 3, 2024Updated last year
- The official implementation of FedCMR: Federated Cross-Modal Retrieval☆17Oct 17, 2025Updated 6 months ago
- [EMNLP 2024] Official repository for paper "From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis"☆21Oct 15, 2024Updated last year