d-ailin / CLIP-Guided-Decoding

☆17

Alternatives and similar repositories for CLIP-Guided-Decoding

Users who are interested in CLIP-Guided-Decoding are comparing it to the repositories listed below.

junyangwang0410 / HaELM
An automatic MLLM hallucination detection framework
☆19Updated 2 years ago
Yangyi-Chen / CoTConsistency
The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".
☆34Updated 2 years ago
JiwanChung / vlis
☆24Updated 2 years ago
yiren-jian / BLIText
[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
☆26Updated 2 years ago
SivanDoveh / DAC
Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models"
☆27Updated 2 years ago
FreedomIntelligence / TRIM
We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆19Updated last year
bcdnlp / FAITHSCORE
FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models
☆31Updated last week
mlfoundations / clip_quality_not_quantity
☆29Updated 3 years ago
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆83Updated last year
lscpku / VITATECS
☆18Updated last year
Yuqifan1117 / HalluciDoctor
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)
☆50Updated last year
GasolSun36 / MVP
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
☆23Updated last year
YuxiXie / V-DPO
Preference Learning for LLaVA
☆56Updated last year
HenryHZY / VL-PET
[ICCV2023] Official code for "VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control"
☆52Updated 2 years ago
muirbench / MuirBench
A Comprehensive Benchmark for Robust Multi-image Understanding
☆17Updated last year
uqzhichen / Awesome-compositional-zero-shot-learning
Paper list of compositional zero-shot learning
☆11Updated 3 years ago
sIncerass / MVLPT
Code for "Multitask Vision-Language Prompt Tuning" https://arxiv.org/abs/2211.11720
☆57Updated last year
haoyiq114 / VALOR
Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)
☆16Updated last year
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆25Updated last year
jiaangli / VLCA
Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
☆16Updated last year
kaistAI / Volcano
[NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…
☆46Updated last year
yfzhang114 / LLaVA-Align
[ACM Multimedia 2025] This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual…
☆82Updated 9 months ago
sangminwoo / AvisC
[ACL 2025 Findings] Official pytorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vis…
☆22Updated last year
archiki / RepARe
☆21Updated 2 years ago
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆154Updated last year
lancopku / clip-openness
[ACL 2023] Delving into the Openness of CLIP
☆23Updated 2 years ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆49Updated last year
eric-ai-lab / CPL
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
☆34Updated 3 years ago
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16Updated last year
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆55Updated last year