CMU-MultiComp-Lab / adv-mmml-course
☆37 Updated last year
Alternatives and similar repositories for adv-mmml-course
Users who are interested in adv-mmml-course are comparing it to the repositories listed below
- ☆91 Updated last year
- ☆29 Updated last year
- Video descriptions of research papers relating to foundation models and scaling ☆31 Updated 2 years ago
- This repository holds code and other relevant files for the NeurIPS 2022 tutorial: Foundational Robustness of Foundation Models. ☆70 Updated 2 years ago
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b … ☆74 Updated last year
- In-the-wild Question Answering ☆15 Updated 2 years ago
- The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St… ☆22 Updated last year
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆28 Updated last month
- (ACM MM24) This is the official repository of GIST: Improving Parameter Efficient Fine Tuning via Knowledge Interaction. ☆10 Updated last year
- A curated list of vision-and-language pre-training (VLP). :-) ☆59 Updated 2 years ago
- ScrollNet for Continual Learning ☆11 Updated last year
- Toloka Visual Question Answering Challenge at WSDM Cup 2023 ☆31 Updated last year
- Basic guidance on how to contribute to Papers with Code ☆23 Updated 3 years ago
- ☆60 Updated 3 weeks ago
- ☆43 Updated 2 weeks ago
- A curated list of Survey Papers on Deep Learning. ☆12 Updated last year
- Distributed Optimization Infra for learning CLIP models ☆26 Updated 8 months ago
- ☆50 Updated 4 months ago
- ☆11 Updated 2 months ago
- [CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings ☆46 Updated last year
- opentqa is an open framework for textbook question answering, which includes xtqa, mcan, cmr, mfb, mutan. ☆11 Updated 4 years ago
- Code Example for Learning Multimodal Data Augmentation in Feature Space ☆43 Updated 2 years ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges ☆30 Updated last year
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week. ☆34 Updated 3 years ago
- [Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆86 Updated last year
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆73 Updated last year
- ☆51 Updated last year
- code for the ddp tutorial ☆32 Updated 3 years ago
- Holistic evaluation of multimodal foundation models ☆47 Updated 9 months ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆34 Updated 9 months ago