CMU-MultiComp-Lab / mmml-course
☆93 Updated last year
Alternatives and similar repositories for mmml-course
Users that are interested in mmml-course are comparing it to the libraries listed below
- ☆38 Updated last year
- ☆30 Updated 2 years ago
- [TMLR 2022] High-Modality Multimodal Transformer ☆117 Updated last year
- https://slds-lmu.github.io/seminar_multimodal_dl/ ☆171 Updated 2 years ago
- ☆101 Updated 3 years ago
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆81 Updated 4 months ago
- This repository holds code and other relevant files for the NeurIPS 2022 tutorial: Foundational Robustness of Foundation Models. ☆72 Updated 2 years ago
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models ☆97 Updated last year
- A curated list of vision-and-language pre-training (VLP). :-) ☆59 Updated 3 years ago
- Reading list for Multimodal Large Language Models ☆68 Updated 2 years ago
- ☆46 Updated 2 years ago
- ICLR 2023 Paper submission analysis from https://openreview.net/group?id=ICLR.cc/2023/Conference ☆106 Updated 3 years ago
- Video descriptions of research papers relating to foundation models and scaling ☆30 Updated 2 years ago
- Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning / Fine-Tuning ☆199 Updated last year
- ☆81 Updated last year
- Code for the DDP tutorial ☆32 Updated 3 years ago
- A collection of multimodal datasets, and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal" ☆83 Updated 3 years ago
- Toloka Visual Question Answering Challenge at WSDM Cup 2023 ☆31 Updated last year
- Google Research ☆46 Updated 3 years ago
- Papers, authors and author affiliations from ICML, NeurIPS and ICLR 2006-2024 ☆43 Updated 6 months ago
- Open source code for AAAI 2023 Paper "BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning" ☆166 Updated 2 years ago
- The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-languag… ☆230 Updated 3 years ago
- ☆35 Updated 3 years ago
- Basic guidance on how to contribute to Papers with Code ☆24 Updated 3 years ago
- In-the-wild Question Answering ☆15 Updated 2 years ago
- Conference schedule, top papers, and analysis of the data for NeurIPS 2023! ☆121 Updated last year
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in PyTorch ☆39 Updated 3 years ago
- Visual Language Transformer Interpreter - An interactive visualization tool for interpreting vision-language transformers ☆97 Updated 2 years ago
- ☆64 Updated 3 years ago
- ☆120 Updated 2 years ago