AILab-CVC/M2PT

[CVPR 2024] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

☆101

Alternatives and similar repositories for M2PT

Users who are interested in M2PT are comparing it to the repositories listed below.

invictus717 / MiCo
[ICCV 2025] Explore the Limits of Omni-modal Pretraining at Scale
☆124 · Sep 2, 2024 · Updated last year
invictus717 / InteractiveVideo
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions
☆133 · Feb 7, 2024 · Updated 2 years ago
njustkmg / NeurIPS24-LFM
☆20 · Jan 21, 2025 · Updated last year
OrigamiSL / OTETrack
Source code of the paper: Overlapped Trajectory-Enhanced Visual Tracking
☆11 · Sep 3, 2024 · Updated last year
JiazuoYu / Fines
Code for the paper "FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning" (NeurIPS 2025).
☆15 · Jan 29, 2026 · Updated 5 months ago
jin-s13 / MMPD-Dataset
MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"
☆21 · Jul 15, 2024 · Updated 2 years ago
AILab-CVC / UniRepLKNet
[CVPR 2024 & TPAMI 2025] UniRepLKNet
☆1,072 · Aug 10, 2025 · Updated 11 months ago
codezakh / LilT
[ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
☆40 · Jul 29, 2023 · Updated 2 years ago
csuhan / OneLLM
[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language
☆666 · Oct 22, 2024 · Updated last year
NJUDeepEngine / CAEF
Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"
☆11 · Oct 11, 2024 · Updated last year
mingzeG / Moment-Probing
A much more powerful probing method to tune your model with promising performance and linear probing training cost!
☆15 · Jul 26, 2023 · Updated 2 years ago
baaivision / EVE
EVE Series: Encoder-Free Vision-Language Models from BAAI
☆374 · Jul 24, 2025 · Updated 11 months ago
zhoujiahuan1991 / CVPR2024-FCS
[CVPR2024] FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning
☆20 · Apr 18, 2025 · Updated last year
HaoWang420 / Gradient-guided-Modality-Decoupling
☆27 · Nov 6, 2025 · Updated 8 months ago
GeWu-Lab / MMPareto_ICML2024
The repo for "MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance", ICML 2024
☆55 · Jun 28, 2024 · Updated 2 years ago
mdswyz / DiCMoR
An official implementation of "Distribution-Consistent Modal Recovering for Incomplete Multimodal Learning" in PyTorch. (ICCV 2023)
☆37 · Sep 28, 2023 · Updated 2 years ago
sail-sg / MMCBench
☆27 · Jan 23, 2024 · Updated 2 years ago
invictus717 / UniDG
Towards Unified and Effective Domain Generalization
☆34 · Nov 27, 2023 · Updated 2 years ago
shkarupa-alex / tfreplknet
Keras (TensorFlow v2) reimplementation of Re-parameterized Large Kernel Network (RepLKNet)
☆17 · Dec 8, 2022 · Updated 3 years ago
invictus717 / MetaTransformer
Meta-Transformer for Unified Multimodal Learning
☆1,650 · Dec 5, 2023 · Updated 2 years ago
haoyi-duan / DG-SCT
NeurIPS'2023 official implementation code
☆70 · Nov 11, 2023 · Updated 2 years ago
GeWu-Lab / Diagnosing_Relearning_ECCV2024
The repo for "Diagnosing and Re-learning for Balanced Multi-modal Learning", ECCV 2024
☆29 · Jul 30, 2024 · Updated last year
srijandas07 / clip_baseline_LTA_Ego4d
Video + CLIP Baseline for Ego4D Long Term Action Anticipation Challenge (CVPR 2022)
☆15 · Jul 4, 2022 · Updated 4 years ago
GeWu-Lab / Valuate-and-Enhance-Multimodal-Cooperation
The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024
☆62 · Nov 5, 2024 · Updated last year
uclaml / COPS
The official implementation of Cross-Task Experience Sharing (COPS)
☆29 · Oct 23, 2024 · Updated last year
0nutation / SpeechAgents
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems
☆87 · Jan 9, 2024 · Updated 2 years ago
tangtaogo / alignmif
☆40 · Jul 20, 2024 · Updated 2 years ago
thu-vis / Uni-Evaluator
A visual analysis tool to support a unified model evaluation for different computer vision tasks, including classification, object detect…
☆18 · Dec 5, 2023 · Updated 2 years ago
MIS-DevWorks / FBR
This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm…
☆11 · Oct 9, 2024 · Updated last year
ZikunZhou / GTELT
An official implementation for "Global Tracking via Ensemble of Local Trackers"
☆11 · Mar 13, 2022 · Updated 4 years ago
GeWu-Lab / Stepping-Stones
The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
☆18 · Oct 11, 2024 · Updated last year
LeapLabTHU / EfficientTrain
1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio…
☆231 · Aug 23, 2024 · Updated last year
OpenGVLab / LCL
[NeurIPS 2024] Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
☆72 · Feb 11, 2025 · Updated last year
showlab / Exo2Ego-V
☆61 · Apr 28, 2025 · Updated last year
mwatkins1970 / SAE_Feature_Interpretability_Tool
A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAE's trained by Joseph Bloom o…
☆19 · Oct 4, 2024 · Updated last year
ziangcao0312 / DiffTF
Official PyTorch implementation of DiffTF (Accepted by ICLR2024)
☆200 · Jul 12, 2024 · Updated 2 years ago
Sense-X / UniHead
Unifying Visual Perception by Dispersible Points Learning (ECCV 2022)
☆52 · Aug 19, 2022 · Updated 3 years ago
fredfung007 / snlt
☆15 · Dec 3, 2021 · Updated 4 years ago
YuxiaoWang-AI / PIHOT
☆12 · Dec 19, 2024 · Updated last year