CMU-MultiComp-Lab / mmml-course
☆98Updated last year
Alternatives and similar repositories for mmml-course
Users who are interested in mmml-course are comparing it to the repositories listed below.
- ☆41Updated last year
- ☆30Updated 2 years ago
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models☆98Updated last year
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b…☆83Updated 7 months ago
- This repository holds code and other relevant files for the NeurIPS 2022 tutorial: Foundational Robustness of Foundation Models.☆72Updated 3 years ago
- ☆101Updated 3 years ago
- https://slds-lmu.github.io/seminar_multimodal_dl/☆171Updated 3 years ago
- [TMLR 2022] High-Modality Multimodal Transformer☆117Updated last year
- ☆49Updated 2 years ago
- Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning/ Fine-Tuning☆201Updated last year
- A curated list of vision-and-language pre-training (VLP). :-)☆62Updated 3 years ago
- A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal"☆83Updated 3 years ago
- Reading list for Multimodal Large Language Models☆69Updated 2 years ago
- Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Model☆39Updated 5 months ago
- Open source code for AAAI 2023 Paper "BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning"☆168Updated 2 years ago
- A Survey on multimodal learning research.☆334Updated 2 years ago
- code for the ddp tutorial☆32Updated 3 years ago
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral)☆49Updated last year
- ICLR 2023 Paper submission analysis from https://openreview.net/group?id=ICLR.cc/2023/Conference☆107Updated 3 years ago
- The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St…☆24Updated last year
- The repository collects various multi-modal transformer architectures, including image transformer, video transformer, image-languag…☆233Updated 3 years ago
- In-the-wild Question Answering☆15Updated 2 years ago
- ☆66Updated 3 years ago
- ☆82Updated last year
- This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and …☆323Updated 4 years ago
- OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models☆151Updated 3 years ago
- ☆27Updated last year
- ☆72Updated 4 years ago
- Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone☆131Updated 2 years ago
- Toloka Visual Question Answering Challenge at WSDM Cup 2023☆31Updated last year