CMU-MultiComp-Lab / adv-mmml-course
☆37Updated last year
Alternatives and similar repositories for adv-mmml-course
Users that are interested in adv-mmml-course are comparing it to the libraries listed below
- ☆91Updated last year
- ☆28Updated last year
- Video descriptions of research papers relating to foundation models and scaling☆31Updated 2 years ago
- ☆11Updated last month
- ☆45Updated 3 months ago
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b…☆74Updated last year
- Toloka Visual Question Answering Challenge at WSDM Cup 2023☆31Updated last year
- This repository holds code and other relevant files for the NeurIPS 2022 tutorial: Foundational Robustness of Foundation Models.☆70Updated 2 years ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆54Updated 5 months ago
- A curated list of vision-and-language pre-training (VLP). :-)☆58Updated 2 years ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆72Updated last year
- I2M2: Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning (NeurIPS 2024)☆19Updated 6 months ago
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models☆26Updated last month
- ScrollNet for Continual Learning☆11Updated last year
- Clipora is a powerful toolkit for fine-tuning OpenCLIP models using Low Rank Adapters (LoRA).☆21Updated 9 months ago
- Implementation of "the first large-scale multimodal mixture of experts models" from the paper: "Multimodal Contrastive Learning with…☆29Updated last month
- Implementation for "The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer"☆25Updated last week
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.☆27Updated last year
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)☆16Updated last year
- Course repository for the Spring 2023 COMP664 course "Deep Learning" at UNC☆14Updated 2 years ago
- Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Models☆27Updated 7 months ago
- [CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings☆46Updated last year
- An Enhanced CLIP Framework for Learning with Synthetic Captions☆30Updated 3 weeks ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆91Updated 4 months ago
- In-the-wild Question Answering☆15Updated 2 years ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch☆88Updated last year
- Conference schedule, top papers, and analysis of the data for NeurIPS 2023!☆119Updated last year
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy☆66Updated last year
- The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St…☆22Updated last year
- Website☆53Updated 2 years ago