CMU-MultiComp-Lab / mmml-course
☆91 Updated last year
Alternatives and similar repositories for mmml-course
Users that are interested in mmml-course are comparing it to the libraries listed below
- ☆36 Updated last year
- ☆29 Updated last year
- This repository holds code and other relevant files for the NeurIPS 2022 tutorial: Foundational Robustness of Foundation Models. ☆71 Updated 2 years ago
- A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal" ☆82 Updated 3 years ago
- ☆98 Updated 2 years ago
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆77 Updated last month
- [TMLR 2022] High-Modality Multimodal Transformer ☆117 Updated 9 months ago
- https://slds-lmu.github.io/seminar_multimodal_dl/ ☆170 Updated 2 years ago
- ☆42 Updated last year
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models ☆96 Updated 11 months ago
- In-the-wild Question Answering ☆15 Updated 2 years ago
- ☆65 Updated 3 years ago
- Reading list for Multimodal Large Language Models ☆68 Updated last year
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral) ☆48 Updated last year
- A reading list of papers about Visual Question Answering. ☆33 Updated 2 years ago
- A curated list of vision-and-language pre-training (VLP). :-) ☆59 Updated 3 years ago
- ☆26 Updated last year
- Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning / Fine-Tuning ☆197 Updated last year
- This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and … ☆304 Updated 3 years ago
- A survey on multimodal learning research. ☆329 Updated last year
- Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language" ☆66 Updated 3 years ago
- ☆35 Updated 3 years ago
- This repository collects a variety of multi-modal transformer architectures, including image transformer, video transformer, image-languag… ☆229 Updated 2 years ago
- ☆33 Updated last year
- Holistic evaluation of multimodal foundation models ☆48 Updated 11 months ago
- Neuron Activation ☆24 Updated 8 months ago
- Open source code for AAAI 2023 Paper "BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning" ☆166 Updated 2 years ago
- ☆81 Updated last year
- Visual Language Transformer Interpreter - An interactive visualization tool for interpreting vision-language transformers ☆94 Updated last year
- Code for the DDP tutorial ☆32 Updated 3 years ago