CMU-MultiComp-Lab / mmml-course
☆91 Updated last year
Alternatives and similar repositories for mmml-course
Users who are interested in mmml-course are comparing it to the repositories listed below:
- ☆37 Updated last year
- ☆28 Updated last year
- [TMLR 2022] High-Modality Multimodal Transformer ☆115 Updated 6 months ago
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models ☆96 Updated 8 months ago
- Holistic evaluation of multimodal foundation models ☆47 Updated 8 months ago
- [T-PAMI] A curated list of self-supervised multimodal learning resources. ☆252 Updated 8 months ago
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆83 Updated last year
- ☆41 Updated 9 months ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022) ☆206 Updated 2 years ago
- Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone ☆129 Updated last year
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆73 Updated last year
- ☆155 Updated 3 years ago
- The official code for the paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs" ☆73 Updated 5 months ago
- Code for the paper titled "CiT: Curation in Training for Effective Vision-Language Data". ☆78 Updated 2 years ago
- Open source code for the AAAI 2023 paper "BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning" ☆162 Updated last year
- ☆68 Updated last year
- [TPAMI] Searching prompt modules for parameter-efficient transfer learning. ☆229 Updated last year
- A curated list of vision-and-language pre-training (VLP). :-) ☆58 Updated 2 years ago
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral) ☆48 Updated last year
- ☆68 Updated 6 years ago
- [NAACL 2025] Towards Rationality in Language and Multimodal Agents: A Survey ☆27 Updated 2 months ago
- This repository holds code and other relevant files for the NeurIPS 2022 tutorial: Foundational Robustness of Foundation Models. ☆70 Updated 2 years ago
- ICLR 2023 Paper submission analysis from https://openreview.net/group?id=ICLR.cc/2023/Conference ☆105 Updated 2 years ago
- Video descriptions of research papers relating to foundation models and scaling ☆31 Updated 2 years ago
- Reading list for Multimodal Large Language Models ☆68 Updated last year
- Residual Prompt Tuning: a method for faster and better prompt tuning. ☆54 Updated last year
- ☆118 Updated 2 years ago
- Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning / Fine-Tuning ☆191 Updated last year
- Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language" ☆63 Updated 2 years ago
- [ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers" ☆236 Updated last year