[CVPR 2024] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
☆101 · Mar 13, 2024 · Updated 2 years ago
Alternatives and similar repositories for M2PT
Users who are interested in M2PT are comparing it to the libraries listed below.
- [ICCV 2025] Explore the Limits of Omni-modal Pretraining at Scale ☆124 · Sep 2, 2024 · Updated last year
- InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions ☆133 · Feb 7, 2024 · Updated 2 years ago
- ☆21 · Jan 21, 2025 · Updated last year
- I2M2: Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning (NeurIPS 2024) ☆22 · Oct 30, 2024 · Updated last year
- Source code of the paper: Overlapped Trajectory-Enhanced Visual Tracking ☆11 · Sep 3, 2024 · Updated last year
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon… ☆18 · Jan 18, 2025 · Updated last year
- MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset" ☆21 · Jul 15, 2024 · Updated last year
- [CVPR 2024 & TPAMI 2025] UniRepLKNet ☆1,072 · Aug 10, 2025 · Updated 8 months ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆40 · Jul 29, 2023 · Updated 2 years ago
- ☆10 · Oct 20, 2023 · Updated 2 years ago
- [CVPR 2024] OneLLM: One Framework to Align All Modalities with Language ☆665 · Oct 22, 2024 · Updated last year
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 · Oct 11, 2024 · Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI ☆369 · Jul 24, 2025 · Updated 9 months ago
- [CVPR2024] FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning ☆17 · Apr 18, 2025 · Updated last year
- The repo for "MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance", ICML 2024 ☆54 · Jun 28, 2024 · Updated last year
- This is an official implementation of our work, Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on V… ☆17 · Sep 24, 2025 · Updated 7 months ago
- An official implementation of "Distribution-Consistent Modal Recovering for Incomplete Multimodal Learning" in PyTorch. (ICCV 2023) ☆36 · Sep 28, 2023 · Updated 2 years ago
- The code for the paper "LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling" (NeurIPS'24). ☆14 · Dec 25, 2024 · Updated last year
- Convolutional Initialization for Data-Efficient Vision Transformers ☆15 · Dec 9, 2025 · Updated 4 months ago
- Towards Unified and Effective Domain Generalization ☆32 · Nov 27, 2023 · Updated 2 years ago
- Keras (TensorFlow v2) reimplementation of Re-parameterized Large Kernel Network (RepLKNet) ☆17 · Dec 8, 2022 · Updated 3 years ago
- NeurIPS'2023 official implementation code ☆70 · Nov 11, 2023 · Updated 2 years ago
- The repo for "Diagnosing and Re-learning for Balanced Multi-modal Learning", ECCV 2024 ☆30 · Jul 30, 2024 · Updated last year
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆17 · Apr 22, 2025 · Updated last year
- Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022) ☆15 · Jul 4, 2022 · Updated 3 years ago
- The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024 ☆61 · Nov 5, 2024 · Updated last year
- Event based Sign-Language-Translation ☆19 · Feb 27, 2026 · Updated 2 months ago
- ☆40 · Jul 20, 2024 · Updated last year
- Accepted at ICCV '23 ☆15 · Oct 4, 2023 · Updated 2 years ago
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems ☆85 · Jan 9, 2024 · Updated 2 years ago
- A visual analysis tool to support a unified model evaluation for different computer vision tasks, including classification, object detect… ☆18 · Dec 5, 2023 · Updated 2 years ago
- ☆12 · Dec 19, 2024 · Updated last year
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024 ☆18 · Oct 11, 2024 · Updated last year
- ☆17 · Apr 18, 2025 · Updated last year
- This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm… ☆11 · Oct 9, 2024 · Updated last year
- [NeurIPS 2024] Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning ☆72 · Feb 11, 2025 · Updated last year
- 1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio… ☆227 · Aug 23, 2024 · Updated last year
- Official PyTorch implementation of DiffTF (Accepted by ICLR2024) ☆197 · Jul 12, 2024 · Updated last year
- ☆15 · Aug 1, 2021 · Updated 4 years ago