SuperBruceJia / Awesome-Large-Vision-Language-Model
Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Model
☆23 Updated 5 months ago
Alternatives and similar repositories for Awesome-Large-Vision-Language-Model:
Users who are interested in Awesome-Large-Vision-Language-Model are comparing it to the repositories listed below.
- I2M2: Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning (NeurIPS 2024) ☆17 Updated 4 months ago
- Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry" ☆13 Updated last year
- Awesome List of Vision Language Prompt Papers ☆45 Updated last year
- [NeurIPS 2023] Make Your Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning ☆31 Updated last year
- ☆16 Updated 5 months ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆21 Updated 7 months ago
- ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2 ☆62 Updated 3 months ago
- ☆41 Updated last year
- ☆16 Updated 7 months ago
- An Enhanced CLIP Framework for Learning with Synthetic Captions ☆27 Updated 3 months ago
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆29 Updated 5 months ago
- ☆42 Updated 2 months ago
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning ☆19 Updated 6 months ago
- Official repo of Progressive Data Expansion: data, code and evaluation ☆28 Updated last year
- Official Code for ICML 2023 Paper: On the Generalization of Multi-modal Contrastive Learning ☆25 Updated last year
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆101 Updated 6 months ago
- ☆43 Updated 5 months ago
- Visual question answering prompting recipes for large vision-language models ☆24 Updated 6 months ago
- [BMVC 2022] Information Theoretic Representation Distillation ☆18 Updated last year
- Awesome multi-modal large language model papers and projects, plus collections of popular training strategies, e.g., PEFT, LoRA. ☆27 Updated 7 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆52 Updated 6 months ago
- Official code repository for paper: "ExPLoRA: Parameter-Efficient Extended Pre-training to Adapt Vision Transformers under Domain Shifts" ☆31 Updated 5 months ago
- ☆22 Updated 5 months ago
- MuCR is a benchmark designed to evaluate Multimodal Large Language Models' (MLLMs) ability to discern causal links across modalities ☆14 Updated last month