Flowerfan / VistaLLaMA

☆14

Alternatives and similar repositories for VistaLLaMA

Users who are interested in VistaLLaMA are comparing it to the repositories listed below.

Paranioar / UniPT
[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
☆67 Updated last year
showlab / datacentric.vlp
Compress conventional Vision-Language Pre-training data
☆52 Updated 2 years ago
UniAdapter / UniAdapter
☆26 Updated 2 years ago
dengandong / GroundMoRe
☆14 Updated 7 months ago
callsys / GenPromp
[ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization
☆57 Updated last year
zjr2000 / REVERIE
[ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
☆19 Updated last year
dhg-wei / TOPA
(NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment
☆31 Updated last year
whwu95 / FreeVA
FreeVA: Offline MLLM as Training-Free Video Assistant
☆64 Updated last year
IIGROUP / SCL
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
☆20 Updated last year
rui-qian / READ
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆47 Updated 3 weeks ago
TencentARC / FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
☆32 Updated 2 years ago
bruceyo / V-PETL
Towards a Unified View on Visual Parameter-Efficient Transfer Learning
☆26 Updated 3 years ago
Shengcao-Cao / groundLMM
Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision
☆40 Updated 2 weeks ago
takomc / amp
【NeurIPS 2024】The official code for the paper "Automated Multi-level Preference for MLLMs"
☆20 Updated last year
HKUST-LongGroup / DyME
Empowering Small VLMs to Think with Dynamic Memorization and Exploration
☆15 Updated last month
lixinustc / GraphAdapter
The efficient tuning method for VLMs
☆80 Updated last year
geekyutao / TaskRes
Task Residual for Tuning Vision-Language Models (CVPR 2023)
☆73 Updated 2 years ago
YBZh / OpenOOD-VLM
ECCV24, NeurIPS24, Benchmarking Generalized Out-of-Distribution Detection with Vision-Language Models
☆28 Updated 10 months ago
lloongx / DIKI
[ECCV 2024] Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models
☆54 Updated last year
dhg-wei / MCL
(ICML 2024) Improve Context Understanding in Multimodal Large Language Models via Multimodal Composition Learning
☆27 Updated last year
zhaohengyuan1 / Genixer
(ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator
☆114 Updated 7 months ago
zycheiheihei / Transferable-Visual-Prompting
[CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…
☆45 Updated 10 months ago
TencentARC / pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
☆33 Updated 2 years ago
Share14 / ShareGemini
☆31 Updated last year
ziplab / SPT
[ICCV 2023 oral] This is the official repository for our paper: "Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning".
☆74 Updated 2 years ago
CVMI-Lab / clip-beyond-tail
(NeurIPS 2024) What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights
☆29 Updated last year
GasolSun36 / MVP
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
☆22 Updated last year
wuw2019 / R-AMT
☆20 Updated 2 years ago
HuiGuanLab / RaTSG
This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"
☆11 Updated 2 months ago
mlvlab / VidChain
Official Implementation (Pytorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti…
☆22 Updated 9 months ago