Flowerfan / VistaLLaMALinks
☆15Updated 10 months ago
Alternatives and similar repositories for VistaLLaMA
Users that are interested in VistaLLaMA are comparing it to the libraries listed below
Sorting:
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning☆26Updated 3 years ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"☆67Updated last year
- ☆14Updated 6 months ago
- Compress conventional Vision-Language Pre-training data☆52Updated 2 years ago
- Learning 1D Causal Visual Representation with De-focus Attention Networks☆35Updated last year
- This repository contains the code of our paper 'Skip \n: A simple method to reduce hallucination in Large Vision-Language Models'.☆14Updated last year
- [ECCV 2024] ControlCap: Controllable Region-level Captioning☆79Updated 11 months ago
- [AAAI 2023] Zero-Shot Enhancement of CLIP with Parameter-free Attention☆89Updated 2 years ago
- Turning to Video for Transcript Sorting☆48Updated 2 years ago
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning☆22Updated last year
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.☆33Updated 2 years ago
- [ICLR2025] γ -MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models☆39Updated 8 months ago
- (NeurIPS 2024) What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights☆28Updated 11 months ago
- ☆26Updated 2 years ago
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"☆39Updated last week
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs"☆19Updated last year
- [ICCV 2023 oral] This is the official repository for our paper: ''Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning''.☆73Updated 2 years ago
- [CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering☆49Updated 3 months ago
- [ICCV2023] Borrowing Knowledge From Pre-trained Language Model: A New Data-efficient Visual Learning Paradigm☆18Updated 2 years ago
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization☆57Updated last year
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆44Updated 10 months ago
- Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆16Updated 2 years ago
- (ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator☆114Updated 6 months ago
- Task Residual for Tuning Vision-Language Models (CVPR 2023)☆73Updated 2 years ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆10Updated 3 months ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)☆32Updated 2 years ago
- ☆81Updated 11 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant☆63Updated last year
- VisualGPTScore for visio-linguistic reasoning☆27Updated 2 years ago
- [CVPR 2025] PyTorch implementation of paper "FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training"☆31Updated 3 months ago