AttentionX / InstructBLIP_PEFTLinks

☆41

Alternatives and similar repositories for InstructBLIP_PEFT

Users that are interested in InstructBLIP_PEFT are comparing it to the libraries listed below

Sorting:

Jiaxuan-Li / EVCap
[CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
☆54Updated last year
AoiDragon / POPE
[EMNLP'23] The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''
☆84Updated last year
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆145Updated last year
yuhui-zh15 / VLMClassifier
Official implementation of "Why are Visually-Grounded Language Models Bad at Image Classification?" (NeurIPS 2024)
☆85Updated 8 months ago
pkunlp-icler / MIC
MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU
☆47Updated last year
vinid / neg_clip
NegCLIP.
☆32Updated 2 years ago
Liuziyu77 / RAR
The official implementation of RAR
☆88Updated last year
opendatalab / HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
☆88Updated last year
chancharikmitra / CCoT
[CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"
☆132Updated last year
xing0047 / cca-llava
[NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention
☆56Updated 6 months ago
Becomebright / GroundVQA
Official PyTorch code of GroundVQA (CVPR'24)
☆61Updated 9 months ago
yfzhang114 / LLaVA-Align
This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat…
☆78Updated 4 months ago
mrwu-mac / R-Bench
[ICML2024] Repo for the paper `Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models'
☆21Updated 5 months ago
RUCAIBox / POPE
The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''
☆209Updated last year
amazon-science / QA-ViT
☆67Updated 11 months ago
Yuqifan1117 / HalluciDoctor
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)
☆45Updated 11 months ago
RupertLuo / VoCoT
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
☆65Updated 11 months ago
yaolinli / DeCo
☆37Updated 11 months ago
JiuTian-VL / MoME
[NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
☆66Updated last month
ys-zong / VL-ICL
[ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
☆58Updated 4 months ago
Shengcao-Cao / groundLMM
Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision
☆41Updated 3 months ago
ZhengYu518 / VL-Mamba
Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"
☆81Updated last year
allenai / close
☆59Updated last year
lezhang7 / Enhance-FineGrained
[CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding
☆50Updated 2 months ago
Hui-design / Open-LLaVA-Video-R1
[LLaVA-Video-R1]✨First Adaptation of R1 to LLaVA-Video (2025-03-18)
☆29Updated last month
xuanlinli17 / large_vlm_distillation_ood
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023)
☆58Updated last year
yellow-binary-tree / HawkEye
Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos
☆43Updated last year
patrick-tssn / VideoHallucer
VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
☆32Updated 2 months ago
vishaal27 / SuS-X
Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]
☆102Updated last year
CHENGY12 / PLOT
[ICLR2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
☆167Updated last year