SuperBruceJia / Awesome-Large-Vision-Language-Model
Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Models
☆27 Updated 8 months ago
Alternatives and similar repositories for Awesome-Large-Vision-Language-Model
Users who are interested in Awesome-Large-Vision-Language-Model are comparing it to the repositories listed below.
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆29 Updated 8 months ago
- 🔥MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer [Official, ICLR 2023] ☆21 Updated last year
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆46 Updated last week
- The official PyTorch implementation of the paper "MLAE: Masked LoRA Experts for Visual Parameter-Efficient Fine-Tuning" ☆29 Updated 6 months ago
- Code for CVPR 2025 "MMRL: Multi-Modal Representation Learning for Vision-Language Models" and its extension "MMRL++: Parameter-Efficient a… ☆42 Updated 2 weeks ago
- [ACL 2023] Code for paper “Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation” (https://arxiv.org/abs/2305.… ☆38 Updated 2 years ago
- Awesome List of Vision Language Prompt Papers ☆45 Updated last year
- ☆17 Updated 8 months ago
- Official repo of M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning ☆23 Updated 2 months ago
- CaMML: Context-Aware MultiModal Learner for Large Models (ACL 2024) ☆15 Updated 2 weeks ago
- LibMoE: A Library for Comprehensive Benchmarking Mixture of Experts in Large Language Models ☆39 Updated last month
- [ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning ☆56 Updated 4 months ago
- ☆25 Updated 11 months ago
- Distributed Optimization Infra for learning CLIP models ☆26 Updated 8 months ago
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models ☆65 Updated last month
- ☆50 Updated 4 months ago
- ☆77 Updated 5 months ago
- Visual question answering prompting recipes for large vision-language models ☆26 Updated 8 months ago
- ☆81 Updated 2 months ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding ☆58 Updated 2 months ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning" ☆12 Updated this week
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆75 Updated last year
- [ICLR 2024 Spotlight] "Negative Label Guided OOD Detection with Pretrained Vision-Language Models" ☆21 Updated 7 months ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆56 Updated 5 months ago
- ☆31 Updated last month
- ☆18 Updated 10 months ago
- PyTorch implementation of MCM (Delving into out-of-distribution detection with vision-language representations), NeurIPS 2022 ☆81 Updated last year
- ☆20 Updated 6 months ago
- [NeurIPS 2023] Official repository for "Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models" ☆12 Updated 11 months ago
- [NeurIPS 2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model ☆86 Updated last year