assafbk / mocha_codeLinks

Mitigating Open-Vocabulary Caption Hallucinations (EMNLP 2024)

☆17

Alternatives and similar repositories for mocha_code

Users that are interested in mocha_code are comparing it to the libraries listed below

Sorting:

THUNLP-MT / Brote
☆11Updated 9 months ago
bcdnlp / FAITHSCORE
FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models
☆30Updated 7 months ago
adobe-research / llava-score
☆11Updated last year
MrZilinXiao / AutoVER
[ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.
☆14Updated last year
JiwanChung / vlis
☆24Updated 2 years ago
ys-zong / VL-ICL
[ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
☆65Updated last month
junyangwang0410 / HaELM
An automatic MLLM hallucination detection framework
☆19Updated 2 years ago
sterzhang / PVIT
Official Repository of Personalized Visual Instruct Tuning
☆32Updated 8 months ago
patrick-tssn / VideoHallucer
VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
☆38Updated 2 weeks ago
foundation-multimodal-models / CAL
[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
☆57Updated last year
UW-Madison-Lee-Lab / CoBSAT
Implementation and dataset for paper "Can MLLMs Perform Text-to-Image In-Context Learning?"
☆41Updated 5 months ago
DAMO-NLP-SG / CMM
✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
☆50Updated 3 months ago
bronyayang / HallE_Control
HallE-Control: Controlling Object Hallucination in LMMs
☆31Updated last year
shiqichen17 / VLM_Merging
Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)
☆80Updated last month
Kevinz-code / SeVa
[MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501
☆57Updated last year
FudanDISC / ReForm-Eval
An benchmark for evaluating the capabilities of large vision-language models (LVLMs)
☆45Updated last year
AIFEG / BenchLMM
[ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
☆86Updated last year
YuxiXie / V-DPO
Preference Learning for LLaVA
☆52Updated last year
YiyangZhou / POVID
[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
☆88Updated last year
edchengg / oven_eval
ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities
☆43Updated 5 months ago
yfzhang114 / LLaVA-Align
[ACM Multimedia 2025] This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual…
☆82Updated 8 months ago
yaolinli / DeCo
Code for DeCo: Decoupling token compression from semanchc abstraction in multimodal large language models
☆74Updated 3 months ago
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆54Updated last year
AoiDragon / POPE
[EMNLP'23] The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''
☆94Updated 2 months ago
YiyangZhou / CSR
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
☆80Updated 2 weeks ago
kaistAI / Volcano
[NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…
☆46Updated last year
SivanDoveh / DAC
Repository for the paper: dense and aligned captions (dac) promote compositional reasoning in vl models
☆27Updated last year
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆48Updated last year
AlignGPT-VL / AlignGPT
Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"
☆33Updated last year
csarron / PuMer
[ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
☆34Updated last year