Flowerfan / VistaLLaMA
☆14Updated 10 months ago
Alternatives and similar repositories for VistaLLaMA
Users who are interested in VistaLLaMA are comparing it to the repositories listed below
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"☆67Updated last year
- Compress conventional Vision-Language Pre-training data☆52Updated 2 years ago
- ☆26Updated 2 years ago
- ☆14Updated 7 months ago
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization☆57Updated last year
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models☆19Updated last year
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆31Updated last year
- FreeVA: Offline MLLM as Training-Free Video Assistant☆64Updated last year
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning☆20Updated last year
- Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)☆47Updated 3 weeks ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)☆32Updated 2 years ago
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning☆26Updated 3 years ago
- Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision☆40Updated 2 weeks ago
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs"☆20Updated last year
- Empowering Small VLMs to Think with Dynamic Memorization and Exploration☆15Updated last month
- The efficient tuning method for VLMs☆80Updated last year
- Task Residual for Tuning Vision-Language Models (CVPR 2023)☆73Updated 2 years ago
- ECCV24, NeurIPS24, Benchmarking Generalized Out-of-Distribution Detection with Vision-Language Models☆28Updated 10 months ago
- [ECCV 2024] Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models☆54Updated last year
- (ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning☆27Updated last year
- (ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator☆114Updated 7 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…☆45Updated 10 months ago
- Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.☆33Updated 2 years ago
- ☆31Updated last year
- [ICCV 2023 oral] This is the official repository for our paper: "Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning".☆74Updated 2 years ago
- (NeurIPS 2024) What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights☆29Updated last year
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning☆22Updated last year
- ☆20Updated 2 years ago
- This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"☆11Updated 2 months ago
- Official Implementation (Pytorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti…☆22Updated 9 months ago