umd-huang-lab / Mementos

☆31

Alternatives and similar repositories for Mementos

Users who are interested in Mementos are comparing it to the repositories listed below.

Yangyi-Chen / CoTConsistency
The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".
☆34 Updated 2 years ago
mlfoundations / VisIT-Bench
☆50 Updated 2 years ago
psunlpgroup / VisOnlyQA
This repository contains the code and data for the paper "VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception o…
☆27 Updated 3 months ago
YuxiXie / V-DPO
Preference Learning for LLaVA
☆51 Updated 11 months ago
shizhediao / DaVinci
Source code for the paper "Prefix Language Models are Unified Modal Learners"
☆42 Updated 2 years ago
HYPJUDY / Sparkles
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
☆44 Updated last year
YiyangZhou / POVID
[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
☆88 Updated last year
Hxyou / IdealGPT
Official Code of IdealGPT
☆35 Updated 2 years ago
zwq2018 / Multi-modal-Self-instruct
The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…
☆83 Updated 9 months ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆48 Updated last year
princeton-nlp / PTP
Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073
☆31 Updated last year
VisualWebBench / VisualWebBench
Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
☆59 Updated last year
kaistAI / Volcano
[NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…
☆46 Updated last year
vlf-silkie / VLFeedback
☆100 Updated last year
SihengLi99 / TextBind
[2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild
☆46 Updated 2 years ago
bcdnlp / FAITHSCORE
FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models
☆30 Updated 7 months ago
TIGER-AI-Lab / VisualWebInstruct
The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25]
☆35 Updated last month
junyangwang0410 / HaELM
An automatic MLLM hallucination detection framework
☆19 Updated 2 years ago
haoyiq114 / VALOR
Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)
☆16 Updated last year
guilk / KAT
Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language"
☆68 Updated 3 years ago
M3-IT / YING-VLM
Vision Large Language Models trained on M3IT instruction tuning dataset
☆17 Updated 2 years ago
gzcch / Bingo
☆55 Updated last year
DynaMath / DynaMath
A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
☆27 Updated 11 months ago
OpenKG-ORG / EasyDetect
An Easy-to-use Hallucination Detection Framework for LLMs.
☆61 Updated last year
tianyi-lab / Mosaic-IT
[ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning
☆20 Updated last month
AGI-Edgerunners / IIL
Code for our Paper "All in an Aggregated Image for In-Image Learning"
☆29 Updated last year
TobiasLee / VEC
Visual and Embodied Concepts evaluation benchmark
☆21 Updated 2 years ago
Victorwz / VaLM
VaLM: Visually-augmented Language Modeling. ICLR 2023.
☆56 Updated 2 years ago
dvlab-research / Mr-Ben
This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"
☆50 Updated 11 months ago
open-compass / CriticEval
[NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs
☆47 Updated 11 months ago