[CVPR 2024] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
☆101Mar 13, 2024Updated last year
Alternatives and similar repositories for M2PT
Users that are interested in M2PT are comparing it to the libraries listed below
- [ICCV 2025] Explore the Limits of Omni-modal Pretraining at Scale☆123Sep 2, 2024Updated last year
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon…☆18Jan 18, 2025Updated last year
- ☆19Jan 21, 2025Updated last year
- MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"☆21Jul 15, 2024Updated last year
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning☆40Jul 29, 2023Updated 2 years ago
- I2M2: Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning (NeurIPS 2024)☆22Oct 30, 2024Updated last year
- code for "Multitask Vision-Language Prompt Tuning" https://arxiv.org/abs/2211.11720☆56Jun 5, 2024Updated last year
- [ICLR 2026] SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models☆74Jan 29, 2026Updated last month
- NeurIPS'2023 official implementation code☆68Nov 11, 2023Updated 2 years ago
- ☆27Jan 23, 2024Updated 2 years ago
- Repository for the PopulAtion Parameter Averaging (PAPA) paper☆31Apr 11, 2024Updated last year
- InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions☆132Feb 7, 2024Updated 2 years ago
- [WACV 2025-Oral Presentation] Test-Time Adaptation in Point Clouds: Leveraging Sampling Variation with Weight Averaging☆12Mar 31, 2025Updated 11 months ago
- ☆10Oct 20, 2023Updated 2 years ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Oct 11, 2024Updated last year
- A much more powerful probing method for tuning your model, with promising performance at linear-probing training cost!☆15Jul 26, 2023Updated 2 years ago
- SSL Layerwise analysis for speech deepfake detection☆32Aug 5, 2025Updated 6 months ago
- [CVPR 2024] OneLLM: One Framework to Align All Modalities with Language☆666Oct 22, 2024Updated last year
- Source code of the paper: Overlapped Trajectory-Enhanced Visual Tracking☆11Sep 3, 2024Updated last year
- A Chinese singing voice dataset: professional male singer, 105 songs, 132 minutes☆11Oct 19, 2023Updated 2 years ago
- Open source repository for the code accompanying the paper 'PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations'.…☆23Feb 4, 2021Updated 5 years ago
- ☆16Nov 29, 2024Updated last year
- The repo for "Diagnosing and Re-learning for Balanced Multi-modal Learning", ECCV 2024☆30Jul 30, 2024Updated last year
- EVE Series: Encoder-Free Vision-Language Models from BAAI☆368Jul 24, 2025Updated 7 months ago
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems☆85Jan 9, 2024Updated 2 years ago
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua…☆63May 15, 2025Updated 9 months ago
- Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022)☆15Jul 4, 2022Updated 3 years ago
- Real-time speech enhancement☆17Jan 23, 2024Updated 2 years ago
- ☆12Feb 22, 2025Updated last year
- ☆14Aug 1, 2021Updated 4 years ago
- Event based Sign-Language-Translation☆19Updated this week
- The code for the paper "LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling" (NeurIPS'24).☆13Dec 25, 2024Updated last year
- ☆17Apr 18, 2025Updated 10 months ago
- The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024☆59Nov 5, 2024Updated last year
- VoiceLDM: Text-to-Speech with Environmental Context☆191Aug 9, 2024Updated last year
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?☆176Apr 28, 2025Updated 10 months ago
- ☆60Oct 14, 2024Updated last year
- ☆40Jul 20, 2024Updated last year
- Accepted at ICCV '23☆15Oct 4, 2023Updated 2 years ago